[01/11] Initial TI PRU GCC port

Message ID 20180613185805.7833-2-dimitar@dinux.eu
State New
Headers show
Series
  • New backend for the TI PRU processor
Related show

Commit Message

Dimitar Dimitrov June 13, 2018, 6:57 p.m.
ChangeLog:

2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>

	* configure: Regenerate.
	* configure.ac: Add PRU target.

gcc/ChangeLog:

2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>

	* config/pru/pru-ldst-multiple.md: Generate using pru-ldst-multiple.ml.
	* common/config/pru/pru-common.c: New file.
	* config.gcc: Add PRU target.
	* config/pru/alu-zext.md: New file.
	* config/pru/constraints.md: New file.
	* config/pru/predicates.md: New file.
	* config/pru/pru-ldst-multiple.ml: New file.
	* config/pru/pru-opts.h: New file.
	* config/pru/pru-passes.c: New file.
	* config/pru/pru-pragma.c: New file.
	* config/pru/pru-protos.h: New file.
	* config/pru/pru.c: New file.
	* config/pru/pru.h: New file.
	* config/pru/pru.md: New file.
	* config/pru/pru.opt: New file.
	* config/pru/t-pru: New file.
	* doc/extend.texi: Document PRU pragmas.
	* doc/invoke.texi: Document PRU-specific options.
	* doc/md.texi: Document PRU asm constraints.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
---
 configure.ac                        |    7 +
 gcc/common/config/pru/pru-common.c  |   36 +
 gcc/config.gcc                      |    9 +
 gcc/config/pru/alu-zext.md          |  178 +++
 gcc/config/pru/constraints.md       |   88 ++
 gcc/config/pru/predicates.md        |  220 +++
 gcc/config/pru/pru-ldst-multiple.ml |  144 ++
 gcc/config/pru/pru-opts.h           |   31 +
 gcc/config/pru/pru-passes.c         |  230 +++
 gcc/config/pru/pru-pragma.c         |   90 ++
 gcc/config/pru/pru-protos.h         |   70 +
 gcc/config/pru/pru.c                | 2985 +++++++++++++++++++++++++++++++++++
 gcc/config/pru/pru.h                |  551 +++++++
 gcc/config/pru/pru.md               |  905 +++++++++++
 gcc/config/pru/pru.opt              |   53 +
 gcc/config/pru/t-pru                |   31 +
 gcc/doc/extend.texi                 |   20 +
 gcc/doc/invoke.texi                 |   55 +
 gcc/doc/md.texi                     |   22 +
 19 files changed, 5725 insertions(+)
 create mode 100644 gcc/common/config/pru/pru-common.c
 create mode 100644 gcc/config/pru/alu-zext.md
 create mode 100644 gcc/config/pru/constraints.md
 create mode 100644 gcc/config/pru/predicates.md
 create mode 100644 gcc/config/pru/pru-ldst-multiple.ml
 create mode 100644 gcc/config/pru/pru-opts.h
 create mode 100644 gcc/config/pru/pru-passes.c
 create mode 100644 gcc/config/pru/pru-pragma.c
 create mode 100644 gcc/config/pru/pru-protos.h
 create mode 100644 gcc/config/pru/pru.c
 create mode 100644 gcc/config/pru/pru.h
 create mode 100644 gcc/config/pru/pru.md
 create mode 100644 gcc/config/pru/pru.opt
 create mode 100644 gcc/config/pru/t-pru

Comments

Joseph Myers June 13, 2018, 7:44 p.m. | #1
On Wed, 13 Jun 2018, Dimitar Dimitrov wrote:

> +      error ("__delay_cycles() only takes constant arguments");

As in documentation, diagnostics should not use () to indicate that a name 
refers to a function (as opposed to referring to a call with no 
arguments).  However, function names, option names and any other literal 
source code text in diagnostics, both this and other diagnostics in the 
port, should be enclosed in %< and %> to quote them in the diagnostic 
output.

> +         error ("__delay_cycles only takes non-negative cycle counts.");

> +         error ("__delay_cycles is limited to 32-bit loop counts.");

No '.' at end of diagnostics (and use %<__delay_cycles%>, as above).
Dimitar Dimitrov June 18, 2018, 6:45 p.m. | #2
On сряда, 13 юни 2018 г. 19:44:16 EEST Joseph Myers wrote:
> On Wed, 13 Jun 2018, Dimitar Dimitrov wrote:
> > +      error ("__delay_cycles() only takes constant arguments");
> 
> As in documentation, diagnostics should not use () to indicate that a name
> refers to a function (as opposed to referring to a call with no
> arguments).  However, function names, option names and any other literal
> source code text in diagnostics, both this and other diagnostics in the
> port, should be enclosed in %< and %> to quote them in the diagnostic
> output.
> 
> > +         error ("__delay_cycles only takes non-negative cycle counts.");
> > 
> > +         error ("__delay_cycles is limited to 32-bit loop counts.");
> 
> No '.' at end of diagnostics (and use %<__delay_cycles%>, as above).

Thank you for the prompt review. I'll fix and resend patch v2.

Regards,
Dimitar
Jeff Law June 22, 2018, 6:20 p.m. | #3
On 06/13/2018 12:57 PM, Dimitar Dimitrov wrote:
> ChangeLog:
> 
> 2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>
> 
> 	* configure: Regenerate.
> 	* configure.ac: Add PRU target.
> 
> gcc/ChangeLog:
> 
> 2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>
> 
> 	* config/pru/pru-ldst-multiple.md: Generate using pru-ldst-multiple.ml.
> 	* common/config/pru/pru-common.c: New file.
> 	* config.gcc: Add PRU target.
> 	* config/pru/alu-zext.md: New file.
> 	* config/pru/constraints.md: New file.
> 	* config/pru/predicates.md: New file.
> 	* config/pru/pru-ldst-multiple.ml: New file.
> 	* config/pru/pru-opts.h: New file.
> 	* config/pru/pru-passes.c: New file.
> 	* config/pru/pru-pragma.c: New file.
> 	* config/pru/pru-protos.h: New file.
> 	* config/pru/pru.c: New file.
> 	* config/pru/pru.h: New file.
> 	* config/pru/pru.md: New file.
> 	* config/pru/pru.opt: New file.
> 	* config/pru/t-pru: New file.
> 	* doc/extend.texi: Document PRU pragmas.
> 	* doc/invoke.texi: Document PRU-specific options.
> 	* doc/md.texi: Document PRU asm constraints.
Joseph has already made some notes about diagnostics.  Those will need
to be addressed.

A couple global comments on style issues.

First, each function should have a comment describing what it does,
ideally describing the input parameters and output value.

Second the function definition should always look like

[static] <type>
name (type1 arg1, type2 arg2)
{
  body
}

In some cases you've joined the linkage/type line with the function's
name.  Can you please review pru.c in particular to fix these issues.

There's been another round of rtx -> rtx_insn * conversions done in the
generic parts of the compiler.  There's a reasonable chance they will
require trivial updates to the pru port.   When the time comes for
integration the trivial changes to cope with those conversions are
pre-approved and will not require a separate review cycle.

I've already asked about your copyright assignment status.  So you know,
we can't go forward with any nontrivial contributions until the
assignment is in place.

I'm going to assume that you plan to maintain this port.  Ideally you'll
have it using an auto-builder and posting tests gcc-testresults :-0

I'm assuming that since you're patching LRA in a different patch that
you're using LRA rather than reload :-)  That also implies that you're
not using the deprecated cc0 bits, which is good.


> 
> diff --git a/configure.ac b/configure.ac
> index 28155a0e593..684a7f58515 100644
> --- a/configure.ac
> +++ b/configure.ac
[ ... ]
So it looks like you're supporting libgloss/newlib.  Does it work with
the current libgloss/newlib trunk?  I've had troubles over the last few
months with 16 bit targets.


> diff --git a/gcc/common/config/pru/pru-common.c b/gcc/common/config/pru/pru-common.c
> new file mode 100644
> index 00000000000..e87d70ce9ca
> --- /dev/null
> +++ b/gcc/common/config/pru/pru-common.c
[ ... ]
> @@ -0,0 +1,36 @@
> +
> +#undef TARGET_EXCEPT_UNWIND_INFO
> +#define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info

SJLJ exceptions?  Is there some reason you're not using dwarf2 unwind
info?  It may mean adding some notes in the prologue generation code,
but otherwise I'd expect it to "just work".

> +
> +struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 8b2fd908c38..d1cddb86c36 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
[ ... ]
> +pru*-*-*)
> +	tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file}"
> +	tmake_file="${tmake_file} pru/t-pru"
> +	extra_objs="pru-pragma.o pru-passes.o"
> +	use_gcc_stdint=wrap
Can you try to avoid dbxelf.h?  We're trying to get away from embedded
stabs.  I'd like to avoid creating new ports that reference this stuff.



> diff --git a/gcc/config/pru/pru-ldst-multiple.ml b/gcc/config/pru/pru-ldst-multiple.ml
> new file mode 100644
> index 00000000000..961a9fb33e6
> --- /dev/null
> +++ b/gcc/config/pru/pru-ldst-multiple.ml
> @@ -0,0 +1,144 @@
> +(* Auto-generate PRU load/store-multiple patterns
> +   Copyright (C) 2014-2018 Free Software Foundation, Inc.
> +   Based on arm-ldmstm.ml
[ ... ]
So please make sure to include the generated file in the repository.  We
don't want to force developers to have to have ocaml installed to build
the port.  I believe this is consistent with how the ARM port works.

> diff --git a/gcc/config/pru/pru-opts.h b/gcc/config/pru/pru-opts.h
> new file mode 100644
> index 00000000000..1c1514cb2a3
> --- /dev/null
> +++ b/gcc/config/pru/pru-opts.h
[ ... ]
> +
> +/* Definitions for option handling for PRU.  */
> +
> +#ifndef GCC_PRU_OPTS_H
> +#define GCC_PRU_OPTS_H
> +
> +/* ABI variant for code generation.  */
> +enum pru_abi {
> +    PRU_ABI_GNU,
> +    PRU_ABI_TI
Is there really any value in having two ABIs?  If there's another
toolchain out there it'd be best if we were just compatible with that
rather than defining a GNU ABI.  I guess it might be helpful if the TI
ABI is crippled enough that it's hard to run the testsuite..


> diff --git a/gcc/config/pru/pru.c b/gcc/config/pru/pru.c
> new file mode 100644
> index 00000000000..41b0a283d44
> --- /dev/null
> +++ b/gcc/config/pru/pru.c


f
> +
> +  /* Unwind tables currently require a frame pointer for correctness,
> +     see toplev.c:process_options().  */
> +
> +  if ((flag_unwind_tables
> +       || flag_non_call_exceptions
> +       || flag_asynchronous_unwind_tables)
> +      && !ACCUMULATE_OUTGOING_ARGS)
> +    {
> +      flag_omit_frame_pointer = 0;
> +    }
!?!  The comment doesn't make any sense.  Dwarf2 unwinders can certainly
handle this case.  What specific comment in process_options are you
referring to?




> +
> +/* Say that the epilogue uses the return address register.  Note that
> +   in the case of sibcalls, the values "used by the epilogue" are
> +   considered live at the start of the called function.  */
> +#define EPILOGUE_USES(REGNO) (epilogue_completed &&	      \
> +			      (((REGNO) == RA_REGNO)	      \
> +			       || (REGNO) == (RA_REGNO + 1)))
Is this really needed?  If you properly describe the dataflow for your
epilogues I don't think you need this hack.



> diff --git a/gcc/config/pru/pru.md b/gcc/config/pru/pru.md
> new file mode 100644
> index 00000000000..328fb484847
> --- /dev/null
> +++ b/gcc/config/pru/pru.md
> @@ -0,0 +1,905 @@
> +
> +; Combine phase tends to produce this expression when given the following
> +; expression: "uint dst = (ushort) op0 & (uint) op1;
> +;
> +; TODO - fix properly! Understand whether we need to fix combine core, augment
> +; our hooks, or PRU is giving incorrect costs for subexpressions.
> +(define_insn "*and_combine_zero_extend_workaround<mode>"
> + [(set (match_operand:SI 0 "register_operand"       "=r")
> +       (zero_extend:SI
> +	(subreg:EQD
> +	  (and:SI
> +	    (match_operand:SI 1 "register_operand"  "%r")
> +	    (match_operand:SI 2 "register_operand"  "rI"))
> +	  0)))]
> + ""
> + "and\\t%0, %1, %F2<EQD:regwidth>"
> + [(set_attr "type" "alu")])
You might want to revisit the need for this -- we went through a round
of fixes during the last couple release cycles to clean this stuff up.
Note I would have said something about this regardless of your TODO note
because of the subreg expression (they're certainly *valid* in machine
descriptions, but often a symptom of a problem elsewhere).  This was the
only subreg expression I saw in your md file, which is good :-)


So the biggest issue is the desire to have more than 30 operands on some
insns.  Can you describe in greater detail why that's beneficial.

jeff
Dimitar Dimitrov June 23, 2018, 3:42 p.m. | #4
On петък, 22 юни 2018 г. 12:20:46 EEST Jeff Law wrote:
> On 06/13/2018 12:57 PM, Dimitar Dimitrov wrote:
> > ChangeLog:
> > 
> > 2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>
> > 
> > 	* configure: Regenerate.
> > 	* configure.ac: Add PRU target.
> > 
> > gcc/ChangeLog:
> > 
> > 2018-06-13  Dimitar Dimitrov  <dimitar@dinux.eu>
> > 
> > 	* config/pru/pru-ldst-multiple.md: Generate using pru-ldst-multiple.ml.
> > 	* common/config/pru/pru-common.c: New file.
> > 	* config.gcc: Add PRU target.
> > 	* config/pru/alu-zext.md: New file.
> > 	* config/pru/constraints.md: New file.
> > 	* config/pru/predicates.md: New file.
> > 	* config/pru/pru-ldst-multiple.ml: New file.
> > 	* config/pru/pru-opts.h: New file.
> > 	* config/pru/pru-passes.c: New file.
> > 	* config/pru/pru-pragma.c: New file.
> > 	* config/pru/pru-protos.h: New file.
> > 	* config/pru/pru.c: New file.
> > 	* config/pru/pru.h: New file.
> > 	* config/pru/pru.md: New file.
> > 	* config/pru/pru.opt: New file.
> > 	* config/pru/t-pru: New file.
> > 	* doc/extend.texi: Document PRU pragmas.
> > 	* doc/invoke.texi: Document PRU-specific options.
> > 	* doc/md.texi: Document PRU asm constraints.
> 
> Joseph has already made some notes about diagnostics.  Those will need
> to be addressed.
> 
> A couple global comments on style issues.
> 
> First, each function should have a comment describing what it does,
> ideally describing the input parameters and output value.
> 
> Second the function definition should always look like
> 
> [static] <type>
> name (type1 arg1, type2 arg2)
> {
>   body
> }
> 
> In some cases you've joined the linkage/type line with the function's
> name.  Can you please review pru.c in particular to fix these issues.
I'll fix these in patch v2.

> 
[...]
> 
> I've already asked about your copyright assignment status.  So you know,
> we can't go forward with any nontrivial contributions until the
> assignment is in place.
Yes, FSF has my assignment since November 2016.

> 
> I'm going to assume that you plan to maintain this port.  Ideally you'll
> have it using an auto-builder and posting tests gcc-testresults :-0
Yes, I'm willing to maintain it. To be fair, it is entirely in my own spare 
time. I've been writing, rewriting and maintaning this out-of-tree port for 
the past 4 years. I believe I'll have time and will to continue doing so for 
the foreseeable future.

Emailing the results would actually be easier for me than maintaining my own 
testresults web page. Thanks.

> 
> I'm assuming that since you're patching LRA in a different patch that
> you're using LRA rather than reload :-)  That also implies that you're
> not using the deprecated cc0 bits, which is good.
Yes, port is using LRA. No cc0.

> 
> > diff --git a/configure.ac b/configure.ac
> > index 28155a0e593..684a7f58515 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> 
> [ ... ]
> So it looks like you're supporting libgloss/newlib.  Does it work with
> the current libgloss/newlib trunk?  I've had troubles over the last few
> months with 16 bit targets.
I have not detected issues with top-of-tree. But keep in mind that PRU is 8-
bit only for the GCC internal representation. The port declares efficient 32-
bit ops for SImode. From newlib's point of view, PRU is native 32-bit.

Some history here: http://gcc.gnu.org/ml/gcc/2017-01/msg00217.html

> 
> > diff --git a/gcc/common/config/pru/pru-common.c
> > b/gcc/common/config/pru/pru-common.c new file mode 100644
> > index 00000000000..e87d70ce9ca
> > --- /dev/null
> > +++ b/gcc/common/config/pru/pru-common.c
> 
> [ ... ]
> 
> > @@ -0,0 +1,36 @@
> > +
> > +#undef TARGET_EXCEPT_UNWIND_INFO
> > +#define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info
> 
> SJLJ exceptions?  Is there some reason you're not using dwarf2 unwind
> info?  It may mean adding some notes in the prologue generation code,
> but otherwise I'd expect it to "just work".
I would like to avoid increasing memory footprint. I saw that AVR folks use 
it, too. My understanding is that eh_frame data will have to be included for 
each function when using dwarf2 unwind.

For reference, typical data memory size for PRU is 4kb.
> 
> > +
> > +struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 8b2fd908c38..d1cddb86c36 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> 
> [ ... ]
> 
> > +pru*-*-*)
> > +	tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file}"
> > +	tmake_file="${tmake_file} pru/t-pru"
> > +	extra_objs="pru-pragma.o pru-passes.o"
> > +	use_gcc_stdint=wrap
> 
> Can you try to avoid dbxelf.h?  We're trying to get away from embedded
> stabs.  I'd like to avoid creating new ports that reference this stuff.
This was copy-pasta, sorry. Will remove for v2.

> 
> > diff --git a/gcc/config/pru/pru-ldst-multiple.ml
> > b/gcc/config/pru/pru-ldst-multiple.ml new file mode 100644
> > index 00000000000..961a9fb33e6
> > --- /dev/null
> > +++ b/gcc/config/pru/pru-ldst-multiple.ml
> > @@ -0,0 +1,144 @@
> > +(* Auto-generate PRU load/store-multiple patterns
> > +   Copyright (C) 2014-2018 Free Software Foundation, Inc.
> > +   Based on arm-ldmstm.ml
> 
> [ ... ]
> So please make sure to include the generated file in the repository.  We
> don't want to force developers to have to have ocaml installed to build
> the port.  I believe this is consistent with how the ARM port works.
Will send it with the v2 patch set.

> 
> > diff --git a/gcc/config/pru/pru-opts.h b/gcc/config/pru/pru-opts.h
> > new file mode 100644
> > index 00000000000..1c1514cb2a3
> > --- /dev/null
> > +++ b/gcc/config/pru/pru-opts.h
> 
> [ ... ]
> 
> > +
> > +/* Definitions for option handling for PRU.  */
> > +
> > +#ifndef GCC_PRU_OPTS_H
> > +#define GCC_PRU_OPTS_H
> > +
> > +/* ABI variant for code generation.  */
> > +enum pru_abi {
> > +    PRU_ABI_GNU,
> > +    PRU_ABI_TI
> 
> Is there really any value in having two ABIs?  If there's another
> toolchain out there it'd be best if we were just compatible with that
> rather than defining a GNU ABI.  I guess it might be helpful if the TI
> ABI is crippled enough that it's hard to run the testsuite..

TI ABI defines data pointers as 32-bit and function pointers as 16-bit. I 
could not implement this reliably for the PRU GCC port. And it doesn't seem to 
be a good idea anyway:
  http://gcc.gnu.org/ml/gcc/2012-04/msg00870.html

My GCC implementation for TI ABI does not support function pointers and large 
return values. Hence I defined GNU ABI where those are supported, users can 
enjoy full C language capabilities, and a decent testsuite coverage is 
achieved.

> 
> > diff --git a/gcc/config/pru/pru.c b/gcc/config/pru/pru.c
> > new file mode 100644
> > index 00000000000..41b0a283d44
> > --- /dev/null
> > +++ b/gcc/config/pru/pru.c
> 
> f
> 
> > +
> > +  /* Unwind tables currently require a frame pointer for correctness,
> > +     see toplev.c:process_options().  */
> > +
> > +  if ((flag_unwind_tables
> > +       || flag_non_call_exceptions
> > +       || flag_asynchronous_unwind_tables)
> > +      && !ACCUMULATE_OUTGOING_ARGS)
> > +    {
> > +      flag_omit_frame_pointer = 0;
> > +    }
> 
> !?!  The comment doesn't make any sense.  Dwarf2 unwinders can certainly
> handle this case.  What specific comment in process_options are you
> referring to?
A while ago I copied this snippet from AVR port in order to fix a testsuite 
case failure. I'll recheck and either remove it, or update the comment.
> 
> > +
> > +/* Say that the epilogue uses the return address register.  Note that
> > +   in the case of sibcalls, the values "used by the epilogue" are
> > +   considered live at the start of the called function.  */
> > +#define EPILOGUE_USES(REGNO) (epilogue_completed &&	      \
> > +			      (((REGNO) == RA_REGNO)	      \
> > +			       || (REGNO) == (RA_REGNO + 1)))
> 
> Is this really needed?  If you properly describe the dataflow for your
> epilogues I don't think you need this hack.
Yes it is needed. I saw that nios2 port uses it, and I had no idea it is a 
hack.

I'll need some time to research how to fix it. Pointers or suggestions for a 
properly designed GCC backend would be appreciated.

> 
> > diff --git a/gcc/config/pru/pru.md b/gcc/config/pru/pru.md
> > new file mode 100644
> > index 00000000000..328fb484847
> > --- /dev/null
> > +++ b/gcc/config/pru/pru.md
> > @@ -0,0 +1,905 @@
> > +
> > +; Combine phase tends to produce this expression when given the following
> > +; expression: "uint dst = (ushort) op0 & (uint) op1;
> > +;
> > +; TODO - fix properly! Understand whether we need to fix combine core,
> > augment +; our hooks, or PRU is giving incorrect costs for
> > subexpressions. +(define_insn "*and_combine_zero_extend_workaround<mode>"
> > + [(set (match_operand:SI 0 "register_operand"       "=r")
> > +       (zero_extend:SI
> > +	(subreg:EQD
> > +	  (and:SI
> > +	    (match_operand:SI 1 "register_operand"  "%r")
> > +	    (match_operand:SI 2 "register_operand"  "rI"))
> > +	  0)))]
> > + ""
> > + "and\\t%0, %1, %F2<EQD:regwidth>"
> > + [(set_attr "type" "alu")])
> 
> You might want to revisit the need for this -- we went through a round
> of fixes during the last couple release cycles to clean this stuff up.
> Note I would have said something about this regardless of your TODO note
> because of the subreg expression (they're certainly *valid* in machine
> descriptions, but often a symptom of a problem elsewhere).  This was the
> only subreg expression I saw in your md file, which is good :-)
Sure, I'll retest.

> 
> 
> So the biggest issue is the desire to have more than 30 operands on some
> insns.  Can you describe in greater detail why that's beneficial.
I answered in the thread with change itself.

Thanks,
Dimitar
Jeff Law June 28, 2018, 2:07 a.m. | #5
On 06/23/2018 09:42 AM, Dimitar Dimitrov wrote:

>>
>> I've already asked about your copyright assignment status.  So you know,
>> we can't go forward with any nontrivial contributions until the
>> assignment is in place.
> Yes, FSF has my assignment since November 2016.
Great.  Thanks.




>> So it looks like you're supporting libgloss/newlib.  Does it work with
>> the current libgloss/newlib trunk?  I've had troubles over the last few
>> months with 16 bit targets.
> I have not detected issues with top-of-tree. But keep in mind that PRU is 8-
> bit only for the GCC internal representation. The port declares efficient 32-
> bit ops for SImode. From newlib's point of view, PRU is native 32-bit.
The issues are with targets that have 16 bit ints.  At least that's
that's my recollection.  They're not something I worry about too deeply
anymore, so I didn't dig into the details of what broke within newlib.

It sounds like you're lucky in that what you're exposing avoids these
recent breakages in newlib.

> 
> Some history here: http://gcc.gnu.org/ml/gcc/2017-01/msg00217.html
Thanks.  As Nathan mentioned, PRU is a little extreme in its register
handling.  There's always a tradeoff for what level of granularity to
expose on these oddball architectures.

Typically we have the register size match a natural word on the given
architecture and we don't try to allocate the sub-register objects
independently (ie, ah/al on x86 or something like the left/right
registers on the h8).

Given what little I know about the PRU, the actual operations on the
processor are 32 bits wide.  So I probably wouldn't have bothered
exposing byte level operations, except for byte loads/stores and
zero/sign extension from byte to 32bit words.  SImilarly for half-words.

Furthermore I would have restricted the byte and half-word operations to
the lowest covering sub-object.

But that'd just be my choice.  You're obviously allowed to do something
different.

>>
>>> diff --git a/gcc/common/config/pru/pru-common.c
>>> b/gcc/common/config/pru/pru-common.c new file mode 100644
>>> index 00000000000..e87d70ce9ca
>>> --- /dev/null
>>> +++ b/gcc/common/config/pru/pru-common.c
>>
>> [ ... ]
>>
>>> @@ -0,0 +1,36 @@
>>> +
>>> +#undef TARGET_EXCEPT_UNWIND_INFO
>>> +#define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info
>>
>> SJLJ exceptions?  Is there some reason you're not using dwarf2 unwind
>> info?  It may mean adding some notes in the prologue generation code,
>> but otherwise I'd expect it to "just work".
> I would like to avoid increasing memory footprint. I saw that AVR folks use 
> it, too. My understanding is that eh_frame data will have to be included for 
> each function when using dwarf2 unwind.
> 
> For reference, typical data memory size for PRU is 4kb.
Understood.  Of course one could argue that trying to do C++ in a 4kb
footprint is on the absurd side. :-)


>>> +
>>> +/* ABI variant for code generation.  */
>>> +enum pru_abi {
>>> +    PRU_ABI_GNU,
>>> +    PRU_ABI_TI
>>
>> Is there really any value in having two ABIs?  If there's another
>> toolchain out there it'd be best if we were just compatible with that
>> rather than defining a GNU ABI.  I guess it might be helpful if the TI
>> ABI is crippled enough that it's hard to run the testsuite..
> 
> TI ABI defines data pointers as 32-bit and function pointers as 16-bit. I 
> could not implement this reliably for the PRU GCC port. And it doesn't seem to 
> be a good idea anyway:
>   http://gcc.gnu.org/ml/gcc/2012-04/msg00870.html
Ah.  Yea, we've never supported multiple pointer sizes like that.
While we do have special data segments on some architectures, that's
more about the ability to use short offsets from specific base pointers
rather than different sized pointers.

More recently we have added some limited address space support, but
again I don't think it's really a good match for this problem.


> 
> My GCC implementation for TI ABI does not support function pointers and large 
> return values. Hence I defined GNU ABI where those are supported, users can 
> enjoy full C language capabilities, and a decent testsuite coverage is 
> achieved.
ACK.  Thanks for explaining.

>>
>>> +
>>> +/* Say that the epilogue uses the return address register.  Note that
>>> +   in the case of sibcalls, the values "used by the epilogue" are
>>> +   considered live at the start of the called function.  */
>>> +#define EPILOGUE_USES(REGNO) (epilogue_completed &&	      \
>>> +			      (((REGNO) == RA_REGNO)	      \
>>> +			       || (REGNO) == (RA_REGNO + 1)))
>>
>> Is this really needed?  If you properly describe the dataflow for your
>> epilogues I don't think you need this hack.
> Yes it is needed. I saw that nios2 port uses it, and I had no idea it is a 
> hack.
It looks like most ports are using EPILOGUE_USES these days by way of
the df infrastructure which went in a few years ago.  So I probably
shouldn't call it a hack anymore.  Don't worry about trying to eliminate
it.  Sorry for steering you the wrong direction on this.

Jeff

Patch

diff --git a/configure.ac b/configure.ac
index 28155a0e593..684a7f58515 100644
--- a/configure.ac
+++ b/configure.ac
@@ -640,6 +640,10 @@  case "${target}" in
   powerpc-*-aix* | rs6000-*-aix*)
     noconfigdirs="$noconfigdirs target-libssp"
     ;;
+  pru-*-*)
+    # No hosted I/O support.
+    noconfigdirs="$noconfigdirs target-libssp"
+    ;;
   rl78-*-*)
     # libssp uses a misaligned load to trigger a fault, but the RL78
     # doesn't fault for those - instead, it gives a build-time error
@@ -824,6 +828,9 @@  case "${target}" in
   powerpc*-*-*)
     libgloss_dir=rs6000
     ;;
+  pru-*-*)
+    libgloss_dir=pru
+    ;;
   sparc*-*-*)
     libgloss_dir=sparc
     ;;
diff --git a/gcc/common/config/pru/pru-common.c b/gcc/common/config/pru/pru-common.c
new file mode 100644
index 00000000000..e87d70ce9ca
--- /dev/null
+++ b/gcc/common/config/pru/pru-common.c
@@ -0,0 +1,36 @@ 
+/* Common hooks for TI PRU
+   Copyright (C) 2014-2018 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic-core.h"
+#include "tm.h"
+#include "common/common-target.h"
+#include "common/common-target-def.h"
+#include "opts.h"
+#include "flags.h"
+
+#undef TARGET_DEFAULT_TARGET_FLAGS
+#define TARGET_DEFAULT_TARGET_FLAGS		(MASK_OPT_LOOP)
+
+#undef TARGET_EXCEPT_UNWIND_INFO
+#define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info
+
+struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8b2fd908c38..d1cddb86c36 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -493,6 +493,9 @@  powerpc*-*-*)
 	esac
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	;;
+pru-*-*)
+	cpu_type=pru
+	;;
 riscv*)
 	cpu_type=riscv
 	extra_objs="riscv-builtins.o riscv-c.o"
@@ -2638,6 +2641,12 @@  powerpcle-*-eabi*)
 	extra_options="${extra_options} rs6000/sysv4.opt"
 	use_gcc_stdint=wrap
 	;;
+pru*-*-*)
+	tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file}"
+	tmake_file="${tmake_file} pru/t-pru"
+	extra_objs="pru-pragma.o pru-passes.o"
+	use_gcc_stdint=wrap
+	;;
 rs6000-ibm-aix4.[3456789]* | powerpc-ibm-aix4.[3456789]*)
 	tm_file="rs6000/biarch64.h ${tm_file} rs6000/aix.h rs6000/aix43.h rs6000/xcoff.h rs6000/aix-stdint.h"
 	tmake_file="rs6000/t-aix43 t-slibgcc"
diff --git a/gcc/config/pru/alu-zext.md b/gcc/config/pru/alu-zext.md
new file mode 100644
index 00000000000..2112b08d3f4
--- /dev/null
+++ b/gcc/config/pru/alu-zext.md
@@ -0,0 +1,178 @@ 
+;; ALU operations with zero extensions
+;;
+;; Copyright (C) 2015-2018 Free Software Foundation, Inc.
+;; Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_subst_attr "alu2_zext"     "alu2_zext_subst"     "_z" "_noz")
+
+(define_subst_attr "alu3_zext_op1" "alu3_zext_op1_subst" "_z1" "_noz1")
+(define_subst_attr "alu3_zext_op2" "alu3_zext_op2_subst" "_z2" "_noz2")
+(define_subst_attr "alu3_zext"     "alu3_zext_subst"     "_z" "_noz")
+
+(define_subst_attr "bitalu_zext"   "bitalu_zext_subst"   "_z" "_noz")
+
+(define_code_iterator ALUOP3 [plus minus and ior xor umin umax ashift lshiftrt])
+(define_code_iterator ALUOP2 [neg not])
+
+;; Arithmetic Operations
+
+(define_insn "add_impl<EQD:mode><EQS0:mode><EQS1:mode>_<alu3_zext><alu3_zext_op1><alu3_zext_op2>"
+  [(set (match_operand:EQD 0 "register_operand" "=r,r,r")
+	(plus:EQD
+	 (zero_extend:EQD
+	  (match_operand:EQS0 1 "register_operand" "%r,r,r"))
+	 (zero_extend:EQD
+	  (match_operand:EQS1 2 "nonmemory_operand" "r,I,M"))))]
+  ""
+  "@
+   add\\t%0, %1, %2
+   add\\t%0, %1, %2
+   sub\\t%0, %1, %n2"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+(define_insn "sub_impl<EQD:mode><EQS0:mode><EQS1:mode>_<alu3_zext><alu3_zext_op1><alu3_zext_op2>"
+  [(set (match_operand:EQD 0 "register_operand" "=r,r,r")
+	(minus:EQD
+	 (zero_extend:EQD
+	  (match_operand:EQS0 1 "reg_or_ubyte_operand" "r,r,I"))
+	 (zero_extend:EQD
+	  (match_operand:EQS1 2 "reg_or_ubyte_operand" "r,I,r"))))]
+  ""
+  "@
+   sub\\t%0, %1, %2
+   sub\\t%0, %1, %2
+   rsb\\t%0, %2, %1"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+
+(define_insn "neg_impl<EQD:mode><EQS0:mode>_<alu2_zext>"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(neg:EQD
+	  (zero_extend:EQD (match_operand:EQS0 1 "register_operand" "r"))))]
+  ""
+  "rsb\\t%0, %1, 0"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+
+(define_insn "one_impl<EQD:mode><EQS0:mode>_<alu2_zext>"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(not:EQD
+	  (zero_extend:EQD (match_operand:EQS0 1 "register_operand" "r"))))]
+  ""
+  "not\\t%0, %1"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+; Specialized IOR/AND patterns for matching setbit/clearbit instructions.
+;
+; TODO - allow clrbit and setbit to support (1 << REG) constructs
+
+(define_insn "clearbit_<EQD:mode><EQS0:mode>_<bitalu_zext>"
+  [(set (match_operand:EQD 0 "register_operand"	"=r")
+	(and:EQD
+	  (zero_extend:EQD
+	    (match_operand:EQS0 1 "register_operand" "r"))
+	  (match_operand:EQD 2 "single_zero_operand" "n")))]
+  ""
+  "clr\\t%0, %1, %V2"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+(define_insn "setbit_<EQD:mode><EQS0:mode>_<bitalu_zext>"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(ior:EQD
+	  (zero_extend:EQD
+	    (match_operand:EQS0 1 "register_operand" "r"))
+	  (match_operand:EQD 2 "single_one_operand" "n")))]
+  ""
+  "set\\t%0, %1, %T2"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+; Regular ALU ops
+(define_insn "<code>_impl<EQD:mode><EQS0:mode><EQS1:mode>_<alu3_zext><alu3_zext_op1><alu3_zext_op2>"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(LOGICAL:EQD
+	  (zero_extend:EQD
+	    (match_operand:EQS0 1 "register_operand"     "%r"))
+	  (zero_extend:EQD
+	    (match_operand:EQS1 2 "reg_or_ubyte_operand"  "rI"))))]
+  ""
+  "<logical_asm>\\t%0, %1, %2"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+; Shift ALU ops
+(define_insn "<shift_op>_impl<EQD:mode><EQS0:mode><EQS1:mode>_<alu3_zext><alu3_zext_op1><alu3_zext_op2>"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(SHIFT:EQD
+	 (zero_extend:EQD (match_operand:EQS0 1 "register_operand" "r"))
+	 (zero_extend:EQD (match_operand:EQS1 2 "shift_operand"    "rL"))))]
+  ""
+  "<shift_asm>\\t%0, %1, %2"
+  [(set_attr "type" "alu")
+   (set_attr "length" "4")])
+
+;; Substitutions
+
+(define_subst "alu2_zext_subst"
+  [(set (match_operand:EQD 0 "" "")
+	(ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""))))]
+  ""
+  [(set (match_dup 0)
+	(ALUOP2:EQD (match_dup 1)))])
+
+(define_subst "bitalu_zext_subst"
+  [(set (match_operand:EQD 0 "" "")
+	(ALUOP3:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""))
+		    (match_operand:EQD 2 "" "")))]
+  ""
+  [(set (match_dup 0)
+	(ALUOP3:EQD (match_dup 1)
+		    (match_dup 2)))])
+
+(define_subst "alu3_zext_subst"
+  [(set (match_operand:EQD 0 "" "")
+	(ALUOP3:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""))
+		    (zero_extend:EQD (match_operand:EQD 2 "" ""))))]
+  ""
+  [(set (match_dup 0)
+	(ALUOP3:EQD (match_dup 1)
+		    (match_dup 2)))])
+
+(define_subst "alu3_zext_op1_subst"
+  [(set (match_operand:EQD 0 "" "")
+	(ALUOP3:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""))
+		    (zero_extend:EQD (match_operand:EQS1 2 "" ""))))]
+  ""
+  [(set (match_dup 0)
+	(ALUOP3:EQD (match_dup 1)
+		    (zero_extend:EQD (match_dup 2))))])
+
+(define_subst "alu3_zext_op2_subst"
+  [(set (match_operand:EQD 0 "" "")
+	(ALUOP3:EQD (zero_extend:EQD (match_operand:EQS0 1 "" ""))
+		    (zero_extend:EQD (match_operand:EQD 2 "" ""))))]
+  ""
+  [(set (match_dup 0)
+	(ALUOP3:EQD (zero_extend:EQD (match_dup 1))
+		    (match_dup 2)))])
diff --git a/gcc/config/pru/constraints.md b/gcc/config/pru/constraints.md
new file mode 100644
index 00000000000..326c8e9907c
--- /dev/null
+++ b/gcc/config/pru/constraints.md
@@ -0,0 +1,88 @@ 
+;; Constraint definitions for TI PRU.
+;; Copyright (C) 2014-2018 Free Software Foundation, Inc.
+;; Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; We use the following constraint letters for constants:
+;;
+;;  I: 0 to 255.
+;;  J: 0 to 65535.
+;;  L: 0 to 31 (for shift counts).
+;;  M: -255 to 0 (for converting ADD to SUB with suitable UBYTE OP2).
+;;  N: -32768 to 32767 (16-bit signed integer)
+;;  P: 1
+;;  T: Text segment label.  Needed to know when to select %pmem relocation.
+;;  Z: Constant integer zero.
+;;
+;; We use the following built-in register classes:
+;;
+;;  r: general purpose register (r0..r31)
+;;  m: memory operand
+;;
+;; The following constraints are intended for internal use only:
+;;  j: jump address register suitable for sibling calls
+;;  l: the internal counter register used by LOOP instructions
+
+;; Register constraints.
+
+(define_register_constraint "j" "SIB_REGS"
+  "A register suitable for an indirect sibcall.")
+
+(define_register_constraint "l" "LOOPCNTR_REGS"
+  "The internal counter register used by the LOOP instruction.")
+
+;; Integer constraints.
+
+(define_constraint "I"
+  "An unsigned 8-bit constant."
+  (and (match_code "const_int")
+       (match_test "UBYTE_INT (ival)")))
+
+(define_constraint "J"
+  "An unsigned 16-bit constant."
+  (and (match_code "const_int")
+       (match_test "UHWORD_INT (ival)")))
+
+(define_constraint "L"
+  "An unsigned 5-bit constant (for shift counts)."
+  (and (match_code "const_int")
+       (match_test "ival >= 0 && ival <= 31")))
+
+(define_constraint "M"
+  "A constant in the range [-255;0]."
+  (and (match_code "const_int")
+       (match_test "UBYTE_INT (-ival)")))
+
+(define_constraint "N"
+  "A constant in the range [-32768;32767]."
+  (and (match_code "const_int")
+       (match_test "SHWORD_INT (ival)")))
+
+(define_constraint "P"
+  "A constant 1."
+  (and (match_code "const_int")
+       (match_test "ival == 1")))
+
+(define_constraint "T"
+  "A text segment (program memory) constant label."
+  (match_test "text_segment_operand (op, VOIDmode)"))
+
+(define_constraint "Z"
+  "An integer constant zero."
+  (and (match_code "const_int")
+       (match_test "ival == 0")))
diff --git a/gcc/config/pru/predicates.md b/gcc/config/pru/predicates.md
new file mode 100644
index 00000000000..0015220c022
--- /dev/null
+++ b/gcc/config/pru/predicates.md
@@ -0,0 +1,220 @@ 
+;; Predicate definitions for TI PRU.
+;; Copyright (C) 2014-2018 Free Software Foundation, Inc.
+;; Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_predicate "const_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 1, 1)")))
+
+(define_predicate "const_ubyte_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 0xff)")))
+
+(define_predicate "const_uhword_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 0xffff)")))
+
+; TRUE for comparisons we support.
+(define_predicate "pru_cmp_operator"
+  (match_code "eq,ne,leu,ltu,geu,gtu"))
+
+; TRUE for signed comparisons that need special handling for PRU.
+(define_predicate "pru_signed_cmp_operator"
+  (match_code "ge,gt,le,lt"))
+
+;; FP Comparisons handled by pru_expand_pru_compare.
+(define_predicate "pru_fp_comparison_operator"
+  (match_code "eq,lt,gt,le,ge"))
+
+;; Return true if OP is a constant that contains only one 1 in its
+;; binary representation.
+(define_predicate "single_one_operand"
+  (and (match_code "const_int")
+       (match_test "exact_log2 (INTVAL (op) & GET_MODE_MASK (mode)) >= 0")))
+
+;; Return true if OP is a constant that contains only one 0 in its
+;; binary representation.
+(define_predicate "single_zero_operand"
+  (and (match_code "const_int")
+       (match_test "exact_log2 (~INTVAL (op) & GET_MODE_MASK (mode)) >= 0")))
+
+
+(define_predicate "reg_or_ubyte_operand"
+  (ior (match_operand 0 "const_ubyte_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "reg_or_const_1_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_shift_operand"
+  (and (match_code "const_int")
+       (match_test "SHIFT_INT (INTVAL (op))")))
+
+(define_predicate "shift_operand"
+  (ior (match_operand 0 "const_shift_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "ctable_addr_operand"
+  (and (match_code "const_int")
+       (match_test ("(pru_get_ctable_base_index (INTVAL (op)) >= 0)"))))
+
+(define_predicate "ctable_base_operand"
+  (and (match_code "const_int")
+       (match_test ("(pru_get_ctable_exact_base_index (INTVAL (op)) >= 0)"))))
+
+;; Ideally we should enforce a restriction to all text labels to fit in
+;; 16bits, as required by the PRU ISA.  But for the time being we'll rely on
+;; binutils to catch text segment overflows.
+(define_predicate "call_operand"
+  (ior (match_operand 0 "immediate_operand")
+       (match_operand 0 "register_operand")))
+
+;; Return true if OP is a text segment reference.
+;; This is needed for program memory address expressions.  Borrowed from AVR.
+(define_predicate "text_segment_operand"
+  (match_code "code_label,label_ref,symbol_ref,plus,minus,const")
+{
+  switch (GET_CODE (op))
+    {
+    case CODE_LABEL:
+      return true;
+    case LABEL_REF :
+      return true;
+    case SYMBOL_REF :
+      return SYMBOL_REF_FUNCTION_P (op);
+    case PLUS :
+    case MINUS :
+      /* Assume canonical format of symbol + constant.
+	 Fall through.  */
+    case CONST :
+      return text_segment_operand (XEXP (op, 0), VOIDmode);
+    default :
+      return false;
+    }
+})
+
+;; Return 1 if OP is a load multiple operation.  It is known to be a
+;; PARALLEL and the first section will be tested.
+
+(define_predicate "pru_load_multiple_operation"
+  (match_code "parallel")
+{
+  int count = XVECLEN (op, 0);
+  int base_offset = 0;
+  int dest_regno;
+  rtx src_addr;
+  int i;
+
+  /* Perform a quick check so we don't blow up below.  */
+  if (count < 1
+      || GET_CODE (XVECEXP (op, 0, 0)) != SET
+      || GET_CODE (SET_DEST (XVECEXP (op, 0, 0))) != REG
+      || GET_CODE (SET_SRC (XVECEXP (op, 0, 0))) != MEM)
+    return 0;
+
+  dest_regno = REGNO (SET_DEST (XVECEXP (op, 0, 0)));
+  src_addr = XEXP (SET_SRC (XVECEXP (op, 0, 0)), 0);
+
+  /* Zero memory offset will be optimized, so handle both PLUS and REG here.
+     Note that subsequent registers in the sequence will always have
+     non-zero offsets because we can't have negative offsets, and
+     we always increment while going forward.  */
+  if (GET_CODE (src_addr) == PLUS
+      && GET_CODE (XEXP (src_addr, 1)) == CONST_INT)
+    {
+      base_offset = INTVAL (XEXP (src_addr, 1));
+      src_addr = XEXP (src_addr, 0);
+    }
+
+  if (GET_CODE (src_addr) != REG)
+    return 0;
+
+  for (i = 1; i < count; i++)
+    {
+      rtx elt = XVECEXP (op, 0, i);
+      int offset = i * UNITS_PER_WORD + base_offset;
+
+      if (GET_CODE (elt) != SET
+	  || GET_CODE (SET_DEST (elt)) != REG
+	  || GET_MODE (SET_DEST (elt)) != QImode
+	  || REGNO (SET_DEST (elt))    != (unsigned) (dest_regno + i)
+	  || GET_CODE (SET_SRC (elt))  != MEM
+	  || GET_MODE (SET_SRC (elt))  != QImode
+	  || GET_CODE (XEXP (SET_SRC (elt), 0)) != PLUS
+	  || ! rtx_equal_p (XEXP (XEXP (SET_SRC (elt), 0), 0), src_addr)
+	  || GET_CODE (XEXP (XEXP (SET_SRC (elt), 0), 1)) != CONST_INT
+	  || INTVAL (XEXP (XEXP (SET_SRC (elt), 0), 1)) != offset)
+	return 0;
+    }
+
+  return 1;
+})
+
+;; Similar, but tests for store multiple.
+
+(define_predicate "pru_store_multiple_operation"
+  (match_code "parallel")
+{
+  int count = XVECLEN (op, 0);
+  int base_offset = 0;
+  int src_regno;
+  rtx dest_addr;
+  int i;
+
+  /* Perform a quick check so we don't blow up below.  */
+  if (count < 1
+      || GET_CODE (XVECEXP (op, 0, 0)) != SET
+      || GET_CODE (SET_DEST (XVECEXP (op, 0, 0))) != MEM
+      || GET_CODE (SET_SRC (XVECEXP (op, 0, 0))) != REG)
+    return 0;
+
+  src_regno = REGNO (SET_SRC (XVECEXP (op, 0, 0)));
+  dest_addr = XEXP (SET_DEST (XVECEXP (op, 0, 0)), 0);
+
+  if (GET_CODE (dest_addr) == PLUS
+      && GET_CODE (XEXP (dest_addr, 1)) == CONST_INT)
+    {
+      base_offset = INTVAL (XEXP (dest_addr, 1));
+      dest_addr = XEXP (dest_addr, 0);
+    }
+
+  if (GET_CODE (dest_addr) != REG)
+    return 0;
+
+  for (i = 1; i < count; i++)
+    {
+      rtx elt = XVECEXP (op, 0, i);
+      int offset = i * UNITS_PER_WORD + base_offset;
+
+      if (GET_CODE (elt) != SET
+	  || GET_CODE (SET_SRC (elt)) != REG
+	  || GET_MODE (SET_SRC (elt)) != QImode
+	  || REGNO (SET_SRC (elt)) != (unsigned) (src_regno + i)
+	  || GET_CODE (SET_DEST (elt)) != MEM
+	  || GET_MODE (SET_DEST (elt)) != QImode
+	  || GET_CODE (XEXP (SET_DEST (elt), 0)) != PLUS
+	  || ! rtx_equal_p (XEXP (XEXP (SET_DEST (elt), 0), 0), dest_addr)
+	  || GET_CODE (XEXP (XEXP (SET_DEST (elt), 0), 1)) != CONST_INT
+	  || INTVAL (XEXP (XEXP (SET_DEST (elt), 0), 1)) != offset)
+	return 0;
+    }
+
+  return 1;
+})
diff --git a/gcc/config/pru/pru-ldst-multiple.ml b/gcc/config/pru/pru-ldst-multiple.ml
new file mode 100644
index 00000000000..961a9fb33e6
--- /dev/null
+++ b/gcc/config/pru/pru-ldst-multiple.ml
@@ -0,0 +1,144 @@ 
+(* Auto-generate PRU load/store-multiple patterns
+   Copyright (C) 2014-2018 Free Software Foundation, Inc.
+   Based on arm-ldmstm.ml
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.
+
+   This is an O'Caml program.  The O'Caml compiler is available from:
+   http://caml.inria.fr/
+
+   Or from your favourite OS's friendly packaging system.  Tested with version
+   4.01.0, though other versions will probably work too.
+
+   Run with:
+     ocaml pru-ldst-multiple.ml >/path/to/gcc/config/pru/ldst-multiple.md
+*)
+
+let write_addr_reg nregs first =
+  if not first then
+    Printf.sprintf "(match_dup %d)" (nregs + 1)
+  else
+    Printf.sprintf ("(match_operand:SI %d \"register_operand\" \"r\")")
+      (nregs + 1)
+
+let write_ldm_set nregs opnr first do_first_offs =
+  let indent = "     " in
+  Printf.printf "%s" (if first then "    [" else indent);
+  Printf.printf "(set (match_operand:QI %d \"register_operand\" \"\")\n" opnr;
+  Printf.printf "%s     (mem:QI " indent;
+  if (first && not do_first_offs) then begin
+    Printf.printf "%s" (write_addr_reg nregs first);
+  end else begin
+    Printf.printf "(plus:SI %s" (write_addr_reg nregs first);
+    Printf.printf "\n%s                      (match_operand:SI %d "
+      indent
+      (if do_first_offs then (nregs + 1 + opnr) else (nregs + opnr));
+    if first then
+      Printf.printf "\"const_ubyte_operand\" \"I\"))"
+    else
+      Printf.printf "\"const_int_operand\" \"i\"))"
+  end;
+  Printf.printf "))"
+
+let write_stm_set nregs opnr first do_first_offs =
+  let indent = "     " in
+  Printf.printf "%s" (if first then "    [" else indent);
+  Printf.printf "(set (mem:QI ";
+  if (first && not do_first_offs) then begin
+    Printf.printf "%s" (write_addr_reg nregs first);
+  end else begin
+    Printf.printf "(plus:SI ";
+    Printf.printf "%s" (write_addr_reg nregs first);
+    Printf.printf "\n%s                      (match_operand:SI %d "
+      indent
+      (if do_first_offs then (nregs + 1 + opnr) else (nregs + opnr));
+    begin if first then
+      Printf.printf "\"const_ubyte_operand\" \"I\"))"
+    else
+      Printf.printf "\"const_int_operand\" \"i\"))"
+    end;
+  end;
+  Printf.printf ")\n%s     (match_operand:QI %d \"register_operand\" \"\"))" indent opnr
+
+let rec write_pat_sets func opnr first n_left do_first_offs =
+  func opnr first do_first_offs;
+  begin
+    if n_left > 1 then begin
+      Printf.printf "\n";
+      write_pat_sets func (opnr + 1) false (n_left - 1) do_first_offs;
+    end else
+      Printf.printf "]"
+  end
+
+exception InvalidAddrMode of string;;
+
+let write_pattern_1 name ls nregs do_first_offs write_set_fn =
+  Printf.printf "(define_insn \"*%s_multiple_%d%s\"\n" name nregs
+    (if do_first_offs then "_offs" else "");
+  Printf.printf "  [(match_parallel 0 \"pru_%s_multiple_operation\"\n" ls;
+  write_pat_sets (write_set_fn nregs) 1 true nregs do_first_offs;
+  Printf.printf ")]\n  \"XVECLEN (operands[0], 0) == %d\"\n" nregs;
+  if do_first_offs then begin
+    Printf.printf "  \"%s\\t%%b1, %%F%d, %%%d, %d\"\n" name (nregs + 1) (nregs + 2) (nregs * 1);
+  end else begin
+    Printf.printf "  \"%s\\t%%b1, %%F%d, 0, %d\"\n" name (nregs + 1) (nregs * 1);
+  end;
+  Printf.printf "  [(set_attr \"type\" \"ld\")\n";
+  Printf.printf "   (set_attr \"length\" \"4\")";
+  Printf.printf "])\n\n"
+
+let patterns () =
+  (* The rule for load-multiple of a single register greatly simplifies
+     the prologue/epilogue code.  *)
+  for nregs = 1 to 29; do
+    write_pattern_1 "lbbo" "load" nregs true write_ldm_set;
+    write_pattern_1 "lbbo" "load" nregs false write_ldm_set;
+    write_pattern_1 "sbbo" "store" nregs true write_stm_set;
+    write_pattern_1 "sbbo" "store" nregs false write_stm_set;
+  done
+
+let print_lines = List.iter (fun s -> Format.printf "%s@\n" s)
+
+(* Do it.  *)
+
+let _ =
+  print_lines [
+"/* PRU load/store instruction patterns.  This file was automatically";
+"   generated using pru-ldst-multiple.ml.  Please do not edit manually.";
+"";
+"   Copyright (C) 2018 Free Software Foundation, Inc.";
+"   Contributed by Dimitar Dimitrov <dimitar@dinux.eu>.";
+"   Based on arm-ldmstm.ml";
+"";
+"   This file is part of GCC.";
+"";
+"   GCC is free software; you can redistribute it and/or modify it";
+"   under the terms of the GNU General Public License as published";
+"   by the Free Software Foundation; either version 3, or (at your";
+"   option) any later version.";
+"";
+"   GCC is distributed in the hope that it will be useful, but WITHOUT";
+"   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY";
+"   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public";
+"   License for more details.";
+"";
+"   You should have received a copy of the GNU General Public License and";
+"   a copy of the GCC Runtime Library Exception along with this program;";
+"   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see";
+"   <http://www.gnu.org/licenses/>.  */";
+""];
+  patterns ();
diff --git a/gcc/config/pru/pru-opts.h b/gcc/config/pru/pru-opts.h
new file mode 100644
index 00000000000..1c1514cb2a3
--- /dev/null
+++ b/gcc/config/pru/pru-opts.h
@@ -0,0 +1,31 @@ 
+/* Copyright (C) 2017-2018 Free Software Foundation, Inc.
+   Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Definitions for option handling for PRU.  */
+
+#ifndef GCC_PRU_OPTS_H
+#define GCC_PRU_OPTS_H
+
+/* ABI variant for code generation.  */
+enum pru_abi {
+    PRU_ABI_GNU,
+    PRU_ABI_TI
+};
+
+#endif
diff --git a/gcc/config/pru/pru-passes.c b/gcc/config/pru/pru-passes.c
new file mode 100644
index 00000000000..be327299672
--- /dev/null
+++ b/gcc/config/pru/pru-passes.c
@@ -0,0 +1,230 @@ 
+/* PRU target specific passes
+   Copyright (C) 2017-2018 Free Software Foundation, Inc.
+   Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "context.h"
+#include "tm.h"
+#include "alias.h"
+#include "symtab.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "function.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
+#include "gimple-expr.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+
+#include "pru-protos.h"
+
+namespace {
+
+/* Scan the tree to ensure that the compiled code by GCC
+   conforms to the TI ABI specification.  If GCC cannot
+   output a conforming code, raise an error.  */
+const pass_data pass_data_tiabi_check =
+{
+  GIMPLE_PASS, /* type */
+  "*tiabi_check", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_gimple_any, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_tiabi_check : public gimple_opt_pass
+{
+public:
+  pass_tiabi_check (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_tiabi_check, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual unsigned int execute (function *);
+
+  virtual bool gate (function *fun ATTRIBUTE_UNUSED)
+  {
+    return pru_current_abi == PRU_ABI_TI;
+  }
+
+}; // class pass_tiabi_check
+
+/* Return 1 if type TYPE is a pointer to function type or a
+   structure having a pointer to function type as one of its fields.
+   Otherwise return 0.  */
+static bool
+chkp_type_has_function_pointer (const_tree type)
+{
+  bool res = false;
+
+  if (POINTER_TYPE_P (type) && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (type)))
+    res = true;
+  else if (RECORD_OR_UNION_TYPE_P (type))
+    {
+      tree field;
+
+      for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+	if (TREE_CODE (field) == FIELD_DECL)
+	  res = res || chkp_type_has_function_pointer (TREE_TYPE (field));
+    }
+  else if (TREE_CODE (type) == ARRAY_TYPE)
+    res = chkp_type_has_function_pointer (TREE_TYPE (type));
+
+  return res;
+}
+
+/* Check the function declaration for TI ABI compatibility.  */
+static void
+chk_function_decl (const_tree fntype, location_t call_location)
+{
+  /* GCC does not check if the RETURN VALUE pointer is NULL,
+     so do not allow GCC functions with large return values.  */
+  if (!VOID_TYPE_P (TREE_TYPE (fntype))
+      && pru_return_in_memory (TREE_TYPE (fntype), fntype))
+    error_at (call_location,
+	      "large return values not supported with -mabi=ti option");
+
+  /* Check this function's arguments.  */
+  for (tree p = TYPE_ARG_TYPES (fntype); p; p = TREE_CHAIN (p))
+    {
+      tree arg_type = TREE_VALUE (p);
+      if (chkp_type_has_function_pointer (arg_type))
+	{
+	  error_at (call_location,
+		    "function pointers not supported with -mabi=ti option");
+	}
+    }
+}
+
+static tree
+check_op_callback (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+
+  if (RECORD_OR_UNION_TYPE_P (*tp) || TREE_CODE (*tp) == ENUMERAL_TYPE)
+    {
+      /* Forward declarations have NULL tree type.  Skip them.  */
+      if (TREE_TYPE (*tp) == NULL)
+	return NULL;
+    }
+
+  /* TODO - why C++ leaves INTEGER_TYPE forward declarations around?  */
+  if (TREE_TYPE (*tp) == NULL)
+    return NULL;
+
+  const tree type = TREE_TYPE (*tp);
+
+  /* Direct function calls are allowed, obviously.  */
+  if (TREE_CODE (*tp) == ADDR_EXPR && TREE_CODE (type) == POINTER_TYPE)
+    {
+      const tree ptype = TREE_TYPE (type);
+      if (TREE_CODE (ptype) == FUNCTION_TYPE)
+	return NULL;
+    }
+
+  switch (TREE_CODE (type))
+    {
+    case FUNCTION_TYPE:
+    case METHOD_TYPE:
+	{
+	  /* Note: Do not enforce a small return value.  It is safe to
+	     call any TI ABI function from GCC, since GCC will
+	     never pass NULL.  */
+
+	  /* Check arguments for function pointers.  */
+	  for (tree p = TYPE_ARG_TYPES (type); p; p = TREE_CHAIN (p))
+	    {
+	      tree arg_type = TREE_VALUE (p);
+	      if (chkp_type_has_function_pointer (arg_type))
+		{
+		  error_at (gimple_location (wi->stmt), "function pointers "
+			    "not supported with -mabi=ti option");
+		}
+	    }
+	  break;
+	}
+    case RECORD_TYPE:
+    case UNION_TYPE:
+    case QUAL_UNION_TYPE:
+    case POINTER_TYPE:
+	{
+	  if (chkp_type_has_function_pointer (type))
+	    {
+	      error_at (gimple_location (wi->stmt),
+			"function pointers not supported with -mabi=ti option");
+	      *walk_subtrees = false;
+	    }
+	  break;
+	}
+    default:
+	  break;
+    }
+  return NULL;
+}
+
+unsigned
+pass_tiabi_check::execute (function *fun)
+{
+  struct walk_stmt_info wi;
+  const_tree fntype = TREE_TYPE (fun->decl);
+
+  gimple_seq body = gimple_body (current_function_decl);
+
+  memset (&wi, 0, sizeof (wi));
+  wi.info = NULL;
+  wi.want_locations = true;
+
+  /* Check the function body.  */
+  walk_gimple_seq (body, NULL, check_op_callback, &wi);
+
+  /* Check the function declaration.  */
+  chk_function_decl (fntype, fun->function_start_locus);
+
+  return 0;
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_tiabi_check (gcc::context *ctxt)
+{
+  return new pass_tiabi_check (ctxt);
+}
+
+/* Register as early as possible.  */
+void
+pru_register_abicheck_pass (void)
+{
+  opt_pass *tiabi_check = make_pass_tiabi_check (g);
+  struct register_pass_info tiabi_check_info
+    = { tiabi_check, "*warn_unused_result",
+	1, PASS_POS_INSERT_AFTER
+      };
+  register_pass (&tiabi_check_info);
+}
diff --git a/gcc/config/pru/pru-pragma.c b/gcc/config/pru/pru-pragma.c
new file mode 100644
index 00000000000..f2f045b13a5
--- /dev/null
+++ b/gcc/config/pru/pru-pragma.c
@@ -0,0 +1,90 @@ 
+/* PRU target specific pragmas
+   Copyright (C) 2015-2018 Free Software Foundation, Inc.
+   Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "alias.h"
+#include "symtab.h"
+#include "tree.h"
+#include "c-family/c-pragma.h"
+#include "c-family/c-common.h"
+#include "diagnostic-core.h"
+#include "cpplib.h"
+#include "pru-protos.h"
+
+
+/* Implements the "pragma CTABLE_ENTRY" pragma.  This pragma takes a
+   CTABLE index and an address, and instructs the compiler that
+   LBCO/SBCO can be used on that base address.
+
+   WARNING: Only immediate constant addresses are currently supported.  */
+static void
+pru_pragma_ctable_entry (cpp_reader * reader ATTRIBUTE_UNUSED)
+{
+  tree ctable_index, base_addr;
+  enum cpp_ttype type;
+
+  type = pragma_lex (&ctable_index);
+  if (type == CPP_NUMBER)
+    {
+      type = pragma_lex (&base_addr);
+      if (type == CPP_NUMBER)
+	{
+	  unsigned int i = tree_to_uhwi (ctable_index);
+	  unsigned HOST_WIDE_INT base = tree_to_uhwi (base_addr);
+
+	  type = pragma_lex (&base_addr);
+	  if (type != CPP_EOF)
+	    {
+	      error ("junk at end of #pragma CTABLE_ENTRY");
+	    }
+	  else if (i >= ARRAY_SIZE (pru_ctable))
+	    {
+	      error ("CTABLE_ENTRY index %d is not valid", i);
+	    }
+	  else if (pru_ctable[i].valid && pru_ctable[i].base != base)
+	    {
+	      error ("redefinition of CTABLE_ENTRY %d", i);
+	    }
+	  else
+	    {
+	      if (base & 0xff)
+		warning (0, "CTABLE_ENTRY base address is not "
+			    "a multiple of 256");
+	      pru_ctable[i].base = base;
+	      pru_ctable[i].valid = true;
+	    }
+	  return;
+	}
+    }
+  error ("malformed #pragma CTABLE_ENTRY variable address");
+}
+
+/* Implements REGISTER_TARGET_PRAGMAS.  */
+void
+pru_register_pragmas (void)
+{
+  c_register_pragma (NULL, "ctable_entry", pru_pragma_ctable_entry);
+  c_register_pragma (NULL, "CTABLE_ENTRY", pru_pragma_ctable_entry);
+}
diff --git a/gcc/config/pru/pru-protos.h b/gcc/config/pru/pru-protos.h
new file mode 100644
index 00000000000..a082ede7b1e
--- /dev/null
+++ b/gcc/config/pru/pru-protos.h
@@ -0,0 +1,70 @@ 
+/* Subroutine declarations for TI PRU target support.
+   Copyright (C) 2014-2018 Free Software Foundation, Inc.
+   Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_PRU_PROTOS_H
+#define GCC_PRU_PROTOS_H
+
+struct pru_ctable_entry {
+    bool valid;
+    unsigned HOST_WIDE_INT base;
+};
+
+extern struct pru_ctable_entry pru_ctable[32];
+
+extern int pru_initial_elimination_offset (int, int);
+extern int pru_can_use_return_insn (void);
+extern void pru_expand_prologue (void);
+extern void pru_expand_epilogue (bool);
+extern void pru_function_profiler (FILE *, int);
+
+void pru_register_pragmas (void);
+
+#ifdef RTX_CODE
+extern rtx pru_get_return_address (int);
+extern int pru_hard_regno_rename_ok (unsigned int, unsigned int);
+
+extern const char * pru_output_sign_extend (rtx *);
+extern const char * pru_output_signed_cbranch (rtx *, bool);
+extern const char * pru_output_signed_cbranch_ubyteop2 (rtx *, bool);
+extern const char * pru_output_signed_cbranch_zeroop2 (rtx *, bool);
+
+extern rtx pru_expand_fp_compare (rtx comparison, machine_mode mode);
+
+extern void pru_emit_doloop (rtx *, int);
+
+extern bool pru_regno_ok_for_base_p (int, bool);
+static inline bool pru_regno_ok_for_index_p (int regno, bool strict_p)
+{
+  /* Selection logic is the same - PRU instructions are quite orthogonal.  */
+  return pru_regno_ok_for_base_p (regno, strict_p);
+}
+
+extern int pru_get_ctable_exact_base_index (unsigned HOST_WIDE_INT caddr);
+extern int pru_get_ctable_base_index (unsigned HOST_WIDE_INT caddr);
+extern int pru_get_ctable_base_offset (unsigned HOST_WIDE_INT caddr);
+
+extern void pru_register_abicheck_pass (void);
+#endif /* RTX_CODE */
+
+#ifdef TREE_CODE
+bool pru_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED);
+#endif /* TREE_CODE */
+
+#endif /* GCC_PRU_PROTOS_H */
diff --git a/gcc/config/pru/pru.c b/gcc/config/pru/pru.c
new file mode 100644
index 00000000000..41b0a283d44
--- /dev/null
+++ b/gcc/config/pru/pru.c
@@ -0,0 +1,2985 @@ 
+/* Target machine subroutines for TI PRU.
+   Copyright (C) 2014-2018 Free Software Foundation, Inc.
+   Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "df.h"
+#include "memmodel.h"
+#include "tm_p.h"
+#include "optabs.h"
+#include "regs.h"
+#include "emit-rtl.h"
+#include "recog.h"
+#include "diagnostic-core.h"
+#include "output.h"
+#include "insn-attr.h"
+#include "flags.h"
+#include "explow.h"
+#include "calls.h"
+#include "varasm.h"
+#include "expr.h"
+#include "toplev.h"
+#include "langhooks.h"
+#include "cfgrtl.h"
+#include "stor-layout.h"
+#include "dumpfile.h"
+#include "builtins.h"
+#include "pru-protos.h"
+
+/* This file should be included last.  */
+#include "target-def.h"
+
+#define INIT_ARRAY_ENTRY_BYTES	2
+
+/* Global PRU CTABLE entries, filled in by pragmas, and used for fast
+   addressing via LBCO/SBCO instructions.  */
+struct pru_ctable_entry pru_ctable[32];
+
+/* Forward function declarations.  */
+static bool prologue_saved_reg_p (unsigned);
+static void pru_reorg_loop (rtx_insn *);
+
+struct GTY (()) machine_function
+{
+  /* Current frame information, to be filled in by pru_compute_frame_layout
+     with register save masks, and offsets for the current function.  */
+
+  /* Mask of registers to save.  */
+  HARD_REG_SET save_mask;
+  /* Number of bytes that the entire frame takes up.  */
+  int total_size;
+  /* Number of bytes that variables take up.  */
+  int var_size;
+  /* Number of bytes that outgoing arguments take up.  */
+  int args_size;
+  /* Number of bytes needed to store registers in frame.  */
+  int save_reg_size;
+  /* Offset from new stack pointer to store registers.  */
+  int save_regs_offset;
+  /* Offset from save_regs_offset to store frame pointer register.  */
+  int fp_save_offset;
+  /* != 0 if frame layout already calculated.  */
+  int initialized;
+  /* Number of doloop tags used so far.  */
+  int doloop_tags;
+  /* True if the last tag was allocated to a doloop_end.  */
+  bool doloop_tag_from_end;
+};
+
+/* Stack layout and calling conventions.  */
+
+#define PRU_STACK_ALIGN(LOC)  ROUND_UP((LOC), STACK_BOUNDARY / BITS_PER_UNIT)
+
+/* Return the bytes needed to compute the frame pointer from the current
+   stack pointer.  */
+static int
+pru_compute_frame_layout (void)
+{
+  unsigned int regno;
+  HARD_REG_SET *save_mask;
+  int total_size;
+  int var_size;
+  int out_args_size;
+  int save_reg_size;
+
+  if (cfun->machine->initialized)
+    return cfun->machine->total_size;
+
+  save_mask = &cfun->machine->save_mask;
+  CLEAR_HARD_REG_SET (*save_mask);
+
+  var_size = PRU_STACK_ALIGN ((HOST_WIDE_INT) get_frame_size ());
+  out_args_size = PRU_STACK_ALIGN ((HOST_WIDE_INT) crtl->outgoing_args_size);
+  total_size = var_size + out_args_size;
+
+  /* Calculate space needed for gp registers.  */
+  save_reg_size = 0;
+  for (regno = 0; regno <= LAST_GP_REG; regno++)
+    if (prologue_saved_reg_p (regno))
+      {
+	SET_HARD_REG_BIT (*save_mask, regno);
+	save_reg_size += 1;
+      }
+
+  cfun->machine->fp_save_offset = 0;
+  if (TEST_HARD_REG_BIT (*save_mask, HARD_FRAME_POINTER_REGNUM))
+    {
+      int fp_save_offset = 0;
+      for (regno = 0; regno < HARD_FRAME_POINTER_REGNUM; regno++)
+	if (TEST_HARD_REG_BIT (*save_mask, regno))
+	  fp_save_offset += 1;
+
+      cfun->machine->fp_save_offset = fp_save_offset;
+    }
+
+  save_reg_size = PRU_STACK_ALIGN (save_reg_size);
+  total_size += save_reg_size;
+  total_size += PRU_STACK_ALIGN (crtl->args.pretend_args_size);
+
+  /* Save other computed information.  */
+  cfun->machine->total_size = total_size;
+  cfun->machine->var_size = var_size;
+  cfun->machine->args_size = out_args_size;
+  cfun->machine->save_reg_size = save_reg_size;
+  cfun->machine->initialized = reload_completed;
+  cfun->machine->save_regs_offset = out_args_size + var_size;
+
+  return total_size;
+}
+
+/* Emit efficient RTL equivalent of ADD3 with the given const_int for
+   frame-related registers.
+     op0	  - Destination register.
+     op1	  - First addendum operand (a register).
+     addendum     - Second addendum operand (a constant).
+     kind	  - Note kind.  REG_NOTE_MAX if no note must be added.
+     reg_note_rtx - Reg note RTX.  NULL if it should be computed automatically.
+ */
+static rtx
+pru_add3_frame_adjust (rtx op0, rtx op1, int addendum,
+		       const enum reg_note kind, rtx reg_note_rtx)
+{
+  rtx insn;
+
+  rtx op0_adjust = gen_rtx_SET (op0, plus_constant (Pmode, op1, addendum));
+
+  if (UBYTE_INT (addendum) || UBYTE_INT (-addendum))
+    insn = emit_insn (op0_adjust);
+  else
+    {
+      /* Help the compiler to cope with an arbitrary integer constant.
+	 Reload has finished so we can't expect the compiler to
+	 auto-allocate a temporary register.  But we know that call-saved
+	 registers are not live yet, so we utilize them.  */
+      rtx tmpreg = gen_rtx_REG (Pmode, PROLOGUE_TEMP_REGNO);
+      if (addendum < 0)
+	{
+	  emit_insn (gen_rtx_SET (tmpreg, gen_int_mode (-addendum, Pmode)));
+	  insn = emit_insn (gen_sub3_insn (op0, op1, tmpreg));
+	}
+      else
+	{
+	  emit_insn (gen_rtx_SET (tmpreg, gen_int_mode (addendum, Pmode)));
+	  insn = emit_insn (gen_add3_insn (op0, op1, tmpreg));
+	}
+    }
+
+  /* Attach a note indicating what happened.  */
+  if (reg_note_rtx == NULL_RTX)
+    reg_note_rtx = copy_rtx (op0_adjust);
+  if (kind != REG_NOTE_MAX)
+    add_reg_note (insn, kind, reg_note_rtx);
+
+  RTX_FRAME_RELATED_P (insn) = 1;
+
+  return insn;
+}
+
+/* Add a const_int to the stack pointer register.  */
+static rtx
+pru_add_to_sp (int addendum, const enum reg_note kind)
+{
+  return pru_add3_frame_adjust (stack_pointer_rtx, stack_pointer_rtx,
+				addendum, kind, NULL_RTX);
+}
+
+/* Helper function used during prologue/epilogue.  Emits a single LBBO/SBBO
+   instruction for load/store of the next group of consecutive registers.  */
+static int
+xbbo_next_reg_cluster (int regno_start, int *sp_offset, bool do_store)
+{
+  int regno, nregs, i;
+  rtx addr, insn, pat;
+
+  nregs = 0;
+
+  /* Skip the empty slots.  */
+  for (; regno_start <= LAST_GP_REG;)
+    if (TEST_HARD_REG_BIT (cfun->machine->save_mask, regno_start))
+      break;
+    else
+      regno_start++;
+
+  /* Find the largest consecutive group of registers to save.  */
+  for (regno = regno_start; regno <= LAST_GP_REG && nregs < MAX_XBBO_BURST_LEN;)
+    if (TEST_HARD_REG_BIT (cfun->machine->save_mask, regno))
+      {
+	regno++;
+	nregs++;
+      }
+    else
+      break;
+
+  if (!nregs)
+    return -1;
+
+  gcc_assert (UBYTE_INT (*sp_offset));
+
+  /* Ok, save this bunch.  */
+  addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+		       gen_int_mode (*sp_offset, Pmode));
+
+  if (do_store)
+    insn = gen_store_multiple (gen_frame_mem (Pmode, addr),
+			       gen_rtx_REG (QImode, regno_start),
+			       GEN_INT (nregs));
+  else
+    insn = gen_load_multiple (gen_rtx_REG (QImode, regno_start),
+			      gen_frame_mem (Pmode, addr),
+			      GEN_INT (nregs));
+
+  gcc_assert (insn);
+  emit_insn (insn);
+
+  /* Tag as frame-related.  */
+  pat = PATTERN (insn);
+  for (i = 0; i < XVECLEN (pat, 0); i++)
+    if (GET_CODE (XVECEXP (pat, 0, i)) == SET)
+      {
+	RTX_FRAME_RELATED_P (XVECEXP (pat, 0, i)) = 1;
+	if (do_store)
+	  {
+	    /* Tag epilogue unwind note.  */
+	    rtx reg = SET_SRC (XVECEXP (pat, 0, i));
+	    add_reg_note (insn, REG_CFA_RESTORE, copy_rtx (reg));
+	  }
+      }
+
+  /* Increment and save offset in anticipation of the next register group.  */
+  *sp_offset += nregs * UNITS_PER_WORD;
+
+  return regno_start + nregs;
+}
+
+void
+pru_expand_prologue (void)
+{
+  int regno_start;
+  int total_frame_size;
+  int sp_offset;      /* Offset from base_reg to final stack value.  */
+  int save_regs_base; /* Offset from base_reg to register save area.  */
+  int save_offset;    /* Temporary offset to currently saved register group.  */
+  rtx insn;
+
+  total_frame_size = pru_compute_frame_layout ();
+
+  if (flag_stack_usage_info)
+    current_function_static_stack_size = total_frame_size;
+
+  /* Decrement the stack pointer.  */
+  if (!UBYTE_INT (total_frame_size))
+    {
+      /* We need an intermediary point, this will point at the spill block.  */
+      insn = pru_add_to_sp (cfun->machine->save_regs_offset
+			     - total_frame_size,
+			     REG_NOTE_MAX);
+      save_regs_base = 0;
+      sp_offset = -cfun->machine->save_regs_offset;
+    }
+  else if (total_frame_size)
+    {
+      insn = emit_insn (gen_sub2_insn (stack_pointer_rtx,
+				       gen_int_mode (total_frame_size,
+						     Pmode)));
+      RTX_FRAME_RELATED_P (insn) = 1;
+      save_regs_base = cfun->machine->save_regs_offset;
+      sp_offset = 0;
+    }
+  else
+    save_regs_base = sp_offset = 0;
+
+  regno_start = 0;
+  save_offset = save_regs_base;
+  do {
+      regno_start = xbbo_next_reg_cluster (regno_start, &save_offset, true);
+  } while (regno_start >= 0);
+
+  if (frame_pointer_needed)
+    {
+      int fp_save_offset = save_regs_base + cfun->machine->fp_save_offset;
+      pru_add3_frame_adjust (hard_frame_pointer_rtx,
+				    stack_pointer_rtx,
+				    fp_save_offset,
+				    REG_NOTE_MAX,
+				    NULL_RTX);
+    }
+
+  if (sp_offset)
+      pru_add_to_sp (sp_offset, REG_FRAME_RELATED_EXPR);
+
+  /* If we are profiling, make sure no instructions are scheduled before
+     the call to mcount.  */
+  if (crtl->profile)
+    emit_insn (gen_blockage ());
+}
+
+void
+pru_expand_epilogue (bool sibcall_p)
+{
+  rtx cfa_adj;
+  int total_frame_size;
+  int sp_adjust, save_offset;
+  int regno_start;
+
+  if (!sibcall_p && pru_can_use_return_insn ())
+    {
+      emit_jump_insn (gen_return ());
+      return;
+    }
+
+  emit_insn (gen_blockage ());
+
+  total_frame_size = pru_compute_frame_layout ();
+  if (frame_pointer_needed)
+    {
+      /* Recover the stack pointer.  */
+      cfa_adj = plus_constant (Pmode, stack_pointer_rtx,
+			       (total_frame_size
+				- cfun->machine->save_regs_offset));
+      pru_add3_frame_adjust (stack_pointer_rtx,
+			     hard_frame_pointer_rtx,
+			     -cfun->machine->fp_save_offset,
+			     REG_CFA_DEF_CFA,
+			     cfa_adj);
+
+      save_offset = 0;
+      sp_adjust = total_frame_size - cfun->machine->save_regs_offset;
+    }
+  else if (!UBYTE_INT (total_frame_size))
+    {
+      pru_add_to_sp (cfun->machine->save_regs_offset,
+			    REG_CFA_ADJUST_CFA);
+      save_offset = 0;
+      sp_adjust = total_frame_size - cfun->machine->save_regs_offset;
+    }
+  else
+    {
+      save_offset = cfun->machine->save_regs_offset;
+      sp_adjust = total_frame_size;
+    }
+
+  regno_start = 0;
+  do {
+      regno_start = xbbo_next_reg_cluster (regno_start, &save_offset, false);
+  } while (regno_start >= 0);
+
+  if (sp_adjust)
+      pru_add_to_sp (sp_adjust, REG_CFA_ADJUST_CFA);
+
+  if (!sibcall_p)
+    emit_jump_insn (gen_simple_return ());
+}
+
+/* Implement RETURN_ADDR_RTX.  Note, we do not support moving
+   back to a previous frame.  */
+rtx
+pru_get_return_address (int count)
+{
+  if (count != 0)
+    return NULL_RTX;
+
+  /* Return r3.w2.  */
+  return get_hard_reg_initial_val (HImode, RA_REGNO);
+}
+
+/* Implement FUNCTION_PROFILER macro.  */
+void
+pru_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
+{
+  fprintf (file, "\tmov\tr1, ra\n");
+  fprintf (file, "\tcall\t_mcount\n");
+  fprintf (file, "\tmov\tra, r1\n");
+}
+
+/* Dump stack layout.  */
+static void
+pru_dump_frame_layout (FILE *file)
+{
+  fprintf (file, "\t%s Current Frame Info\n", ASM_COMMENT_START);
+  fprintf (file, "\t%s total_size = %d\n", ASM_COMMENT_START,
+	   cfun->machine->total_size);
+  fprintf (file, "\t%s var_size = %d\n", ASM_COMMENT_START,
+	   cfun->machine->var_size);
+  fprintf (file, "\t%s args_size = %d\n", ASM_COMMENT_START,
+	   cfun->machine->args_size);
+  fprintf (file, "\t%s save_reg_size = %d\n", ASM_COMMENT_START,
+	   cfun->machine->save_reg_size);
+  fprintf (file, "\t%s initialized = %d\n", ASM_COMMENT_START,
+	   cfun->machine->initialized);
+  fprintf (file, "\t%s save_regs_offset = %d\n", ASM_COMMENT_START,
+	   cfun->machine->save_regs_offset);
+  fprintf (file, "\t%s is_leaf = %d\n", ASM_COMMENT_START,
+	   crtl->is_leaf);
+  fprintf (file, "\t%s frame_pointer_needed = %d\n", ASM_COMMENT_START,
+	   frame_pointer_needed);
+  fprintf (file, "\t%s pretend_args_size = %d\n", ASM_COMMENT_START,
+	   crtl->args.pretend_args_size);
+}
+
+/* Return true if REGNO should be saved in the prologue.  */
+static bool
+prologue_saved_reg_p (unsigned regno)
+{
+  gcc_assert (GP_REG_P (regno));
+
+  if (df_regs_ever_live_p (regno) && !call_used_regs[regno])
+    return true;
+
+  /* 32-bit FP.  */
+  if (frame_pointer_needed
+      && regno >= HARD_FRAME_POINTER_REGNUM
+      && regno < HARD_FRAME_POINTER_REGNUM + GET_MODE_SIZE (Pmode))
+    return true;
+
+  /* 16-bit RA.  */
+  if (regno == RA_REGNO && df_regs_ever_live_p (RA_REGNO))
+    return true;
+  if (regno == RA_REGNO + 1 && df_regs_ever_live_p (RA_REGNO + 1))
+    return true;
+
+  return false;
+}
+
+/* Implement TARGET_CAN_ELIMINATE.  */
+static bool
+pru_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
+{
+  if (to == STACK_POINTER_REGNUM)
+    return !frame_pointer_needed;
+  return true;
+}
+
+/* Implement INITIAL_ELIMINATION_OFFSET macro.  */
+int
+pru_initial_elimination_offset (int from, int to)
+{
+  int offset;
+
+  pru_compute_frame_layout ();
+
+  /* Set OFFSET to the offset from the stack pointer.  */
+  switch (from)
+    {
+    case FRAME_POINTER_REGNUM:
+      offset = cfun->machine->args_size;
+      break;
+
+    case ARG_POINTER_REGNUM:
+      offset = cfun->machine->total_size;
+      offset -= crtl->args.pretend_args_size;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  /* If we are asked for the frame pointer offset, then adjust OFFSET
+     by the offset from the frame pointer to the stack pointer.  */
+  if (to == HARD_FRAME_POINTER_REGNUM)
+    offset -= (cfun->machine->save_regs_offset
+	       + cfun->machine->fp_save_offset);
+
+  return offset;
+}
+
+/* Return nonzero if this function is known to have a null epilogue.
+   This allows the optimizer to omit jumps to jumps if no stack
+   was created.  */
+int
+pru_can_use_return_insn (void)
+{
+  if (!reload_completed || crtl->profile)
+    return 0;
+
+  return pru_compute_frame_layout () == 0;
+}
+
+/* Implement TARGET_MODES_TIEABLE_P.  */
+
+static bool
+pru_modes_tieable_p (machine_mode mode1, machine_mode mode2)
+{
+  return (mode1 == mode2
+	  || (GET_MODE_SIZE (mode1) <= 4 && GET_MODE_SIZE (mode2) <= 4));
+}
+
+/* Implement TARGET_HARD_REGNO_MODE_OK.  */
+
+static bool
+pru_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
+{
+  switch (GET_MODE_SIZE (mode))
+    {
+    case 1: return true;
+    case 2: return (regno % 4) <= 2;
+    case 4: return (regno % 4) == 0;
+    default: return (regno % 4) == 0; /* Not sure why VOIDmode is passed.  */
+    }
+}
+
+/* Implement `TARGET_HARD_REGNO_SCRATCH_OK'.  */
+/* Returns true if SCRATCH are safe to be allocated as a scratch
+   registers (for a define_peephole2) in the current function.  */
+
+static bool
+pru_hard_regno_scratch_ok (unsigned int regno)
+{
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't handle a frame pointer that spans more than one register.  */
+
+  if ((!reload_completed || frame_pointer_needed)
+      && ((regno >= HARD_FRAME_POINTER_REGNUM
+	   && regno <= HARD_FRAME_POINTER_REGNUM + 3)
+	  || (regno >= FRAME_POINTER_REGNUM
+	      && regno <= FRAME_POINTER_REGNUM + 3)))
+    {
+      return false;
+    }
+
+  return true;
+}
+
+
+/* Worker function for `HARD_REGNO_RENAME_OK'.  */
+/* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
+
+int
+pru_hard_regno_rename_ok (unsigned int old_reg,
+			  unsigned int new_reg)
+{
+  /* Don't allow hard registers that might be part of the frame pointer.
+     Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
+     and don't care for a frame pointer that spans more than one register.  */
+  if ((!reload_completed || frame_pointer_needed)
+      && ((old_reg >= HARD_FRAME_POINTER_REGNUM
+	   && old_reg <= HARD_FRAME_POINTER_REGNUM + 3)
+	  || (old_reg >= FRAME_POINTER_REGNUM
+	      && old_reg <= FRAME_POINTER_REGNUM + 3)
+	  || (new_reg >= HARD_FRAME_POINTER_REGNUM
+	      && new_reg <= HARD_FRAME_POINTER_REGNUM + 3)
+	  || (new_reg >= FRAME_POINTER_REGNUM
+	      && new_reg <= FRAME_POINTER_REGNUM + 3)))
+    {
+      return 0;
+    }
+
+  return 1;
+}
+
+static void
+pru_init (void)
+{
+  /* Due to difficulties in implementing the TI ABI with GCC,
+     at least check and error-out if GCC cannot compile a
+     compliant output.  */
+  pru_register_abicheck_pass ();
+}
+
+/* Allocate a chunk of memory for per-function machine-dependent data.  */
+static struct machine_function *
+pru_init_machine_status (void)
+{
+  return ggc_cleared_alloc<machine_function> ();
+}
+
+/* Implement TARGET_OPTION_OVERRIDE.  */
+static void
+pru_option_override (void)
+{
+#ifdef SUBTARGET_OVERRIDE_OPTIONS
+  SUBTARGET_OVERRIDE_OPTIONS;
+#endif
+
+  /* Unwind tables currently require a frame pointer for correctness,
+     see toplev.c:process_options().  */
+
+  if ((flag_unwind_tables
+       || flag_non_call_exceptions
+       || flag_asynchronous_unwind_tables)
+      && !ACCUMULATE_OUTGOING_ARGS)
+    {
+      flag_omit_frame_pointer = 0;
+    }
+
+  /* Check for unsupported options.  */
+  if (flag_pic == 1)
+    warning (OPT_fpic, "-fpic is not supported");
+  if (flag_pic == 2)
+    warning (OPT_fPIC, "-fPIC is not supported");
+  if (flag_pie == 1)
+    warning (OPT_fpie, "-fpie is not supported");
+  if (flag_pie == 2)
+    warning (OPT_fPIE, "-fPIE is not supported");
+
+  /* QBxx conditional branching cannot cope with block reordering.  */
+  if (flag_reorder_blocks_and_partition)
+    {
+      inform (input_location, "-freorder-blocks-and-partition not supported "
+			      "on this architecture");
+      flag_reorder_blocks_and_partition = 0;
+      flag_reorder_blocks = 1;
+    }
+
+  /* Function to allocate machine-dependent function status.  */
+  init_machine_status = &pru_init_machine_status;
+
+  /* Save the initial options in case the user does function specific
+     options.  */
+  target_option_default_node = target_option_current_node
+    = build_target_option_node (&global_options);
+
+  /* This needs to be done at start up.  It's convenient to do it here.  */
+  pru_init ();
+}
+
+/* Compute a (partial) cost for rtx X.  Return true if the complete
+   cost has been computed, and false if subexpressions should be
+   scanned.  In either case, *TOTAL contains the cost result.  */
+static bool
+pru_rtx_costs (rtx x, machine_mode mode ATTRIBUTE_UNUSED,
+	       int outer_code, int opno ATTRIBUTE_UNUSED,
+	       int *total, bool speed ATTRIBUTE_UNUSED)
+{
+  const int code = GET_CODE (x);
+
+  switch (code)
+    {
+      case CONST_INT:
+	if (UBYTE_INT (INTVAL (x)))
+	  {
+	    *total = COSTS_N_INSNS (0);
+	    return true;
+	  }
+	else if (outer_code == MEM && ctable_addr_operand (x, VOIDmode))
+	  {
+	    *total = COSTS_N_INSNS (0);
+	    return true;
+	  }
+	else if (UHWORD_INT (INTVAL (x)))
+	  {
+	    *total = COSTS_N_INSNS (1);
+	    return true;
+	  }
+	else
+	  {
+	    *total = COSTS_N_INSNS (2);
+	    return true;
+	  }
+
+      case LABEL_REF:
+      case SYMBOL_REF:
+      case CONST:
+      case CONST_DOUBLE:
+	  {
+	    *total = COSTS_N_INSNS (1);
+	    return true;
+	  }
+    case SET:
+	{
+	  /* A SET doesn't have a mode, so let's look at the SET_DEST to get
+	     the mode for the factor.  */
+	  mode = GET_MODE (SET_DEST (x));
+
+	  if (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (SImode)
+	      && (GET_CODE (SET_SRC (x)) == ZERO_EXTEND
+		  || outer_code == ZERO_EXTEND))
+	    {
+	      *total = 0;
+	    }
+	  else
+	    {
+	      /* SI move has the same cost as a QI move.  */
+	      int factor = GET_MODE_SIZE (mode) / GET_MODE_SIZE (SImode);
+	      if (factor == 0)
+		factor = 1;
+	      *total = factor * COSTS_N_INSNS (1);
+	    }
+
+	  return false;
+	}
+
+      case MULT:
+	{
+	  *total = COSTS_N_INSNS (8);
+	  return false;
+	}
+      case PLUS:
+	{
+	  rtx op0 = XEXP (x, 0);
+	  rtx op1 = XEXP (x, 1);
+	  if (outer_code == MEM
+	      && ((REG_P (op0) && reg_or_ubyte_operand (op1, VOIDmode))
+		  || (REG_P (op1) && reg_or_ubyte_operand (op0, VOIDmode))
+		  || (ctable_addr_operand (op0, VOIDmode) && op1 == NULL_RTX)
+		  || (ctable_addr_operand (op1, VOIDmode) && op0 == NULL_RTX)
+		  || (ctable_base_operand (op0, VOIDmode) && REG_P (op1))
+		  || (ctable_base_operand (op1, VOIDmode) && REG_P (op0))))
+	    {
+	      /* CTABLE or REG base addressing - PLUS comes for free.  */
+	      *total = COSTS_N_INSNS (0);
+	      return true;
+	    }
+	  else
+	    {
+	      *total = COSTS_N_INSNS (1);
+	      return false;
+	    }
+	}
+      case SIGN_EXTEND:
+	{
+	  *total = COSTS_N_INSNS (3);
+	  return false;
+	}
+      case ASHIFTRT:
+	{
+	  rtx op1 = XEXP (x, 1);
+	  if (const_1_operand (op1, VOIDmode))
+	    *total = COSTS_N_INSNS (2);
+	  else
+	    *total = COSTS_N_INSNS (6);
+	  return false;
+	}
+      case ZERO_EXTRACT:
+	{
+	  rtx op2 = XEXP (x, 2);
+	  if ((outer_code == EQ || outer_code == NE)
+	      && CONST_INT_P (op2)
+	      && INTVAL (op2) == 1)
+	    {
+	      /* Branch if bit is set/clear is a single instruction.  */
+	      *total = COSTS_N_INSNS (0);
+	      return true;
+	    }
+	  else
+	    {
+	      *total = COSTS_N_INSNS (2);
+	      return false;
+	    }
+	}
+
+      case ZERO_EXTEND:
+	{
+	  *total = COSTS_N_INSNS (0);
+	  return false;
+	}
+
+      default:
+	{
+	  /* Do not factor mode size in the cost.  */
+	  *total = COSTS_N_INSNS (1);
+	  return false;
+	}
+    }
+}
+
+/* Implement TARGET_PREFERRED_RELOAD_CLASS.  */
+static reg_class_t
+pru_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t regclass)
+{
+  return regclass == NO_REGS ? GENERAL_REGS : regclass;
+}
+
+static GTY(()) rtx eqdf_libfunc;
+static GTY(()) rtx nedf_libfunc;
+static GTY(()) rtx ledf_libfunc;
+static GTY(()) rtx ltdf_libfunc;
+static GTY(()) rtx gedf_libfunc;
+static GTY(()) rtx gtdf_libfunc;
+static GTY(()) rtx eqsf_libfunc;
+static GTY(()) rtx nesf_libfunc;
+static GTY(()) rtx lesf_libfunc;
+static GTY(()) rtx ltsf_libfunc;
+static GTY(()) rtx gesf_libfunc;
+static GTY(()) rtx gtsf_libfunc;
+
+/* Implement the TARGET_INIT_LIBFUNCS macro.  We use this to rename library
+   functions to match the PRU ABI.  */
+
+static void
+pru_init_libfuncs (void)
+{
+  /* Double-precision floating-point arithmetic.  */
+  set_optab_libfunc (add_optab, DFmode, "__pruabi_addd");
+  set_optab_libfunc (sdiv_optab, DFmode, "__pruabi_divd");
+  set_optab_libfunc (smul_optab, DFmode, "__pruabi_mpyd");
+  set_optab_libfunc (neg_optab, DFmode, "__pruabi_negd");
+  set_optab_libfunc (sub_optab, DFmode, "__pruabi_subd");
+
+  /* Single-precision floating-point arithmetic.  */
+  set_optab_libfunc (add_optab, SFmode, "__pruabi_addf");
+  set_optab_libfunc (sdiv_optab, SFmode, "__pruabi_divf");
+  set_optab_libfunc (smul_optab, SFmode, "__pruabi_mpyf");
+  set_optab_libfunc (neg_optab, SFmode, "__pruabi_negf");
+  set_optab_libfunc (sub_optab, SFmode, "__pruabi_subf");
+
+  /* Floating-point comparisons.  */
+  eqsf_libfunc = init_one_libfunc ("__pruabi_eqf");
+  nesf_libfunc = init_one_libfunc ("__pruabi_neqf");
+  lesf_libfunc = init_one_libfunc ("__pruabi_lef");
+  ltsf_libfunc = init_one_libfunc ("__pruabi_ltf");
+  gesf_libfunc = init_one_libfunc ("__pruabi_gef");
+  gtsf_libfunc = init_one_libfunc ("__pruabi_gtf");
+  eqdf_libfunc = init_one_libfunc ("__pruabi_eqd");
+  nedf_libfunc = init_one_libfunc ("__pruabi_neqd");
+  ledf_libfunc = init_one_libfunc ("__pruabi_led");
+  ltdf_libfunc = init_one_libfunc ("__pruabi_ltd");
+  gedf_libfunc = init_one_libfunc ("__pruabi_ged");
+  gtdf_libfunc = init_one_libfunc ("__pruabi_gtd");
+
+  set_optab_libfunc (eq_optab, SFmode, NULL);
+  set_optab_libfunc (ne_optab, SFmode, "__pruabi_neqf");
+  set_optab_libfunc (gt_optab, SFmode, NULL);
+  set_optab_libfunc (ge_optab, SFmode, NULL);
+  set_optab_libfunc (lt_optab, SFmode, NULL);
+  set_optab_libfunc (le_optab, SFmode, NULL);
+  set_optab_libfunc (unord_optab, SFmode, "__pruabi_unordf");
+  set_optab_libfunc (eq_optab, DFmode, NULL);
+  set_optab_libfunc (ne_optab, DFmode, "__pruabi_neqd");
+  set_optab_libfunc (gt_optab, DFmode, NULL);
+  set_optab_libfunc (ge_optab, DFmode, NULL);
+  set_optab_libfunc (lt_optab, DFmode, NULL);
+  set_optab_libfunc (le_optab, DFmode, NULL);
+  set_optab_libfunc (unord_optab, DFmode, "__pruabi_unordd");
+
+  /* Floating-point to integer conversions.  */
+  set_conv_libfunc (sfix_optab, SImode, DFmode, "__pruabi_fixdi");
+  set_conv_libfunc (ufix_optab, SImode, DFmode, "__pruabi_fixdu");
+  set_conv_libfunc (sfix_optab, DImode, DFmode, "__pruabi_fixdlli");
+  set_conv_libfunc (ufix_optab, DImode, DFmode, "__pruabi_fixdull");
+  set_conv_libfunc (sfix_optab, SImode, SFmode, "__pruabi_fixfi");
+  set_conv_libfunc (ufix_optab, SImode, SFmode, "__pruabi_fixfu");
+  set_conv_libfunc (sfix_optab, DImode, SFmode, "__pruabi_fixflli");
+  set_conv_libfunc (ufix_optab, DImode, SFmode, "__pruabi_fixfull");
+
+  /* Conversions between floating types.  */
+  set_conv_libfunc (trunc_optab, SFmode, DFmode, "__pruabi_cvtdf");
+  set_conv_libfunc (sext_optab, DFmode, SFmode, "__pruabi_cvtfd");
+
+  /* Integer to floating-point conversions.  */
+  set_conv_libfunc (sfloat_optab, DFmode, SImode, "__pruabi_fltid");
+  set_conv_libfunc (ufloat_optab, DFmode, SImode, "__pruabi_fltud");
+  set_conv_libfunc (sfloat_optab, DFmode, DImode, "__pruabi_fltllid");
+  set_conv_libfunc (ufloat_optab, DFmode, DImode, "__pruabi_fltulld");
+  set_conv_libfunc (sfloat_optab, SFmode, SImode, "__pruabi_fltif");
+  set_conv_libfunc (ufloat_optab, SFmode, SImode, "__pruabi_fltuf");
+  set_conv_libfunc (sfloat_optab, SFmode, DImode, "__pruabi_fltllif");
+  set_conv_libfunc (ufloat_optab, SFmode, DImode, "__pruabi_fltullf");
+
+  /* Long long.  */
+  set_optab_libfunc (ashr_optab, DImode, "__pruabi_asrll");
+  set_optab_libfunc (smul_optab, DImode, "__pruabi_mpyll");
+  set_optab_libfunc (ashl_optab, DImode, "__pruabi_lslll");
+  set_optab_libfunc (lshr_optab, DImode, "__pruabi_lsrll");
+
+  set_optab_libfunc (sdiv_optab, SImode, "__pruabi_divi");
+  set_optab_libfunc (udiv_optab, SImode, "__pruabi_divu");
+  set_optab_libfunc (smod_optab, SImode, "__pruabi_remi");
+  set_optab_libfunc (umod_optab, SImode, "__pruabi_remu");
+  set_optab_libfunc (sdivmod_optab, SImode, "__pruabi_divremi");
+  set_optab_libfunc (udivmod_optab, SImode, "__pruabi_divremu");
+  set_optab_libfunc (sdiv_optab, DImode, "__pruabi_divlli");
+  set_optab_libfunc (udiv_optab, DImode, "__pruabi_divull");
+  set_optab_libfunc (smod_optab, DImode, "__pruabi_remlli");
+  set_optab_libfunc (umod_optab, DImode, "__pruabi_remull");
+  set_optab_libfunc (udivmod_optab, DImode, "__pruabi_divremull");
+}
+
+
+/* Emit comparison instruction if necessary, returning the expression
+   that holds the compare result in the proper mode.  Return the comparison
+   that should be used in the jump insn.  */
+
+rtx
+pru_expand_fp_compare (rtx comparison, machine_mode mode)
+{
+  enum rtx_code code = GET_CODE (comparison);
+  rtx op0 = XEXP (comparison, 0);
+  rtx op1 = XEXP (comparison, 1);
+  rtx cmp;
+  enum rtx_code jump_code = code;
+  machine_mode op_mode = GET_MODE (op0);
+  rtx_insn *insns;
+  rtx libfunc;
+
+  gcc_assert (op_mode == DFmode || op_mode == SFmode);
+
+  if (code == UNGE)
+    {
+      code = LT;
+      jump_code = EQ;
+    }
+  else if (code == UNLE)
+    {
+      code = GT;
+      jump_code = EQ;
+    }
+  else
+    jump_code = NE;
+
+  switch (code)
+    {
+    case EQ:
+      libfunc = op_mode == DFmode ? eqdf_libfunc : eqsf_libfunc;
+      break;
+    case NE:
+      libfunc = op_mode == DFmode ? nedf_libfunc : nesf_libfunc;
+      break;
+    case GT:
+      libfunc = op_mode == DFmode ? gtdf_libfunc : gtsf_libfunc;
+      break;
+    case GE:
+      libfunc = op_mode == DFmode ? gedf_libfunc : gesf_libfunc;
+      break;
+    case LT:
+      libfunc = op_mode == DFmode ? ltdf_libfunc : ltsf_libfunc;
+      break;
+    case LE:
+      libfunc = op_mode == DFmode ? ledf_libfunc : lesf_libfunc;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  start_sequence ();
+
+  cmp = emit_library_call_value (libfunc, 0, LCT_CONST, SImode,
+				 op0, op_mode, op1, op_mode);
+  insns = get_insns ();
+  end_sequence ();
+
+  emit_libcall_block (insns, cmp, cmp,
+		      gen_rtx_fmt_ee (code, SImode, op0, op1));
+
+  return gen_rtx_fmt_ee (jump_code, mode, cmp, const0_rtx);
+}
+
+/* Sign extension.  */
+
+static int sign_bit_position (const rtx op)
+{
+  const int sz = GET_MODE_SIZE (GET_MODE (op));
+
+  return  sz * 8 - 1;
+}
+
+const char *
+pru_output_sign_extend (rtx *operands)
+{
+  static char buf[512];
+  int bufi;
+  const int dst_sz = GET_MODE_SIZE (GET_MODE (operands[0]));
+  const int src_sz = GET_MODE_SIZE (GET_MODE (operands[1]));
+  char ext_start;
+
+  switch (src_sz)
+    {
+    case 1: ext_start = 'y'; break;
+    case 2: ext_start = 'z'; break;
+    default: gcc_unreachable ();
+    }
+
+  gcc_assert (dst_sz > src_sz);
+
+  /* Note that src and dst can be different parts of the same
+     register, e.g. "r7, r7.w1".  */
+  bufi = snprintf (buf, sizeof (buf),
+	  "mov\t%%0, %%1\n\t"		      /* Copy AND make positive.  */
+	  "qbbc\t.+8, %%0, %d\n\t"	      /* Check sign bit.  */
+	  "fill\t%%%c0, %d",		      /* Make negative.  */
+	  sign_bit_position (operands[1]),
+	  ext_start,
+	  dst_sz - src_sz);
+
+  gcc_assert (bufi > 0);
+  gcc_assert ((unsigned int)bufi < sizeof (buf));
+
+  return buf;
+}
+
+/* Branches and compares.  */
+
+/* PRU's ALU does not support signed comparison operations.  That's why we
+   emulate them.  By first checking the sign bit and handling every possible
+   operand sign combination, we can simulate signed comparisons in just
+   5 instructions.  See table below.
+
+.-------------------.---------------------------------------------------.
+| Operand sign bit  | Mapping the signed comparison to an unsigned one  |
+|---------+---------+------------+------------+------------+------------|
+| OP1.b31 | OP2.b31 | OP1 < OP2  | OP1 <= OP2 | OP1 > OP2  | OP1 >= OP2 |
+|---------+---------+------------+------------+------------+------------|
+| 0       | 0       | OP1 < OP2  | OP1 <= OP2 | OP1 > OP2  | OP1 >= OP2 |
+|---------+---------+------------+------------+------------+------------|
+| 0       | 1       | false      | false      | true       | true       |
+|---------+---------+------------+------------+------------+------------|
+| 1       | 0       | true       | true       | false      | false      |
+|---------+---------+------------+------------+------------+------------|
+| 1       | 1       | OP1 < OP2  | OP1 <= OP2 | OP1 > OP2  | OP1 >= OP2 |
+`---------'---------'------------'------------'------------+------------'
+
+
+Given the table above, here is an example for a concrete op:
+  LT:
+		    qbbc OP1_POS, OP1, 31
+  OP1_NEG:	    qbbc BRANCH_TAKEN_LABEL, OP2, 31
+  OP1_NEG_OP2_NEG:  qblt BRANCH_TAKEN_LABEL, OP2, OP1
+		    ; jmp OUT -> can be eliminated because we'll take the
+		    ; following branch.  OP2.b31 is guaranteed to be 1
+		    ; by the time we get here.
+  OP1_POS:	    qbbs OUT, OP2, 31
+  OP1_POS_OP2_POS:  qblt BRANCH_TAKEN_LABEL, OP2, OP1
+#if FAR_JUMP
+		    jmp OUT
+BRANCH_TAKEN_LABEL: jmp REAL_BRANCH_TAKEN_LABEL
+#endif
+  OUT:
+
+*/
+
+static const char *
+pru_output_ltle_signed_cbranch (rtx *operands, bool is_near)
+{
+  static char buf[1024];
+  enum rtx_code code = GET_CODE (operands[0]);
+  rtx op1;
+  rtx op2;
+  const char *cmp_opstr;
+  int bufi = 0;
+
+  op1 = operands[1];
+  op2 = operands[2];
+
+  gcc_assert (GET_CODE (op1) == REG && GET_CODE (op2) == REG);
+
+  /* Determine the comparison operators for positive and negative operands.  */
+  if (code == LT)
+      cmp_opstr = "qblt";
+  else if (code == LE)
+      cmp_opstr = "qble";
+  else
+      gcc_unreachable ();
+
+  if (is_near)
+    {
+      bufi = snprintf (buf, sizeof (buf),
+		       "qbbc\t.+12, %%1, %d\n\t"
+		       "qbbc\t%%l3, %%2, %d\n\t"  /* OP1_NEG.  */
+		       "%s\t%%l3, %%2, %%1\n\t"   /* OP1_NEG_OP2_NEG.  */
+		       "qbbs\t.+8, %%2, %d\n\t"   /* OP1_POS.  */
+		       "%s\t%%l3, %%2, %%1",	  /* OP1_POS_OP2_POS.  */
+		       sign_bit_position (op1),
+		       sign_bit_position (op2),
+		       cmp_opstr,
+		       sign_bit_position (op2),
+		       cmp_opstr);
+    }
+  else
+    {
+      bufi = snprintf (buf, sizeof (buf),
+		       "qbbc\t.+12, %%1, %d\n\t"
+		       "qbbc\t.+20, %%2, %d\n\t"  /* OP1_NEG.  */
+		       "%s\t.+16, %%2, %%1\n\t"   /* OP1_NEG_OP2_NEG.  */
+		       "qbbs\t.+16, %%2, %d\n\t"  /* OP1_POS.  */
+		       "%s\t.+8, %%2, %%1\n\t"    /* OP1_POS_OP2_POS.  */
+		       "jmp\t.+8\n\t"		  /* jmp OUT.  */
+		       "jmp\t%%%%label(%%l3)",	  /* BRANCH_TAKEN_LABEL.  */
+		       sign_bit_position (op1),
+		       sign_bit_position (op2),
+		       cmp_opstr,
+		       sign_bit_position (op2),
+		       cmp_opstr);
+    }
+
+  gcc_assert (bufi > 0);
+  gcc_assert ((unsigned int)bufi < sizeof (buf));
+
+  return buf;
+}
+
+static const char *
+pru_output_gtge_signed_cbranch (rtx *operands, bool is_near)
+{
+  static char buf[1024];
+  enum rtx_code code = GET_CODE (operands[0]);
+  rtx op1;
+  rtx op2;
+  const char *cmp_opstr;
+  int bufi = 0;
+
+  op1 = operands[1];
+  op2 = operands[2];
+
+  gcc_assert (GET_CODE (op1) == REG && GET_CODE (op2) == REG);
+
+  /* Determine the comparison operators for positive and negative operands.  */
+  if (code == GT)
+      cmp_opstr = "qbgt";
+  else if (code == GE)
+      cmp_opstr = "qbge";
+  else
+    gcc_unreachable ();
+
+  if (is_near)
+    {
+      bufi = snprintf (buf, sizeof (buf),
+		       "qbbs\t.+12, %%1, %d\n\t"
+		       "qbbs\t%%l3, %%2, %d\n\t"  /* OP1_POS.  */
+		       "%s\t%%l3, %%2, %%1\n\t"   /* OP1_POS_OP2_POS.  */
+		       "qbbc\t.+8, %%2, %d\n\t"   /* OP1_NEG.  */
+		       "%s\t%%l3, %%2, %%1",      /* OP1_NEG_OP2_NEG.  */
+		       sign_bit_position (op1),
+		       sign_bit_position (op2),
+		       cmp_opstr,
+		       sign_bit_position (op2),
+		       cmp_opstr);
+    }
+  else
+    {
+      bufi = snprintf (buf, sizeof (buf),
+		       "qbbs\t.+12, %%1, %d\n\t"
+		       "qbbs\t.+20, %%2, %d\n\t"  /* OP1_POS.  */
+		       "%s\t.+16, %%2, %%1\n\t"   /* OP1_POS_OP2_POS.  */
+		       "qbbc\t.+16, %%2, %d\n\t"  /* OP1_NEG.  */
+		       "%s\t.+8, %%2, %%1\n\t"    /* OP1_NEG_OP2_NEG.  */
+		       "jmp\t.+8\n\t"		  /* jmp OUT.  */
+		       "jmp\t%%%%label(%%l3)",	  /* BRANCH_TAKEN_LABEL.  */
+		       sign_bit_position (op1),
+		       sign_bit_position (op2),
+		       cmp_opstr,
+		       sign_bit_position (op2),
+		       cmp_opstr);
+    }
+
+  gcc_assert (bufi > 0);
+  gcc_assert ((unsigned int)bufi < sizeof (buf));
+
+  return buf;
+}
+
+const char *
+pru_output_signed_cbranch (rtx *operands, bool is_near)
+{
+  enum rtx_code code = GET_CODE (operands[0]);
+
+  if (code == LT || code == LE)
+    return pru_output_ltle_signed_cbranch (operands, is_near);
+  else if (code == GT || code == GE)
+    return pru_output_gtge_signed_cbranch (operands, is_near);
+  else
+      gcc_unreachable ();
+}
+
+/*
+   Optimized version of pru_output_signed_cbranch for constant second
+   operand.  */
+
+const char *
+pru_output_signed_cbranch_ubyteop2 (rtx *operands, bool is_near)
+{
+  static char buf[1024];
+  enum rtx_code code = GET_CODE (operands[0]);
+  int regop_sign_bit_pos = sign_bit_position (operands[1]);
+  const char *cmp_opstr;
+  const char *rcmp_opstr;
+
+  /* We must swap operands due to PRU's demand OP1 to be the immediate.  */
+  code = swap_condition (code);
+
+  /* Determine normal and reversed comparison operators for both positive
+     operands.  This enables us to go completely unsigned.
+
+     NOTE: We cannot use the R print modifier because we convert signed
+     comparison operators to unsigned ones.  */
+  switch (code)
+    {
+    case LT: cmp_opstr = "qblt"; rcmp_opstr = "qbge"; break;
+    case LE: cmp_opstr = "qble"; rcmp_opstr = "qbgt"; break;
+    case GT: cmp_opstr = "qbgt"; rcmp_opstr = "qble"; break;
+    case GE: cmp_opstr = "qbge"; rcmp_opstr = "qblt"; break;
+    default: gcc_unreachable ();
+    }
+
+  /* OP2 is a constant unsigned byte - utilize this info to generate
+     optimized code.  We can "remove half" of the op table above because
+     we know that OP2.b31 = 0 (remember that 0 <= OP2 <= 255).  */
+  if (code == LT || code == LE)
+    {
+      if (is_near)
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t.+8, %%1, %d\n\t"
+		  "%s\t%%l3, %%1, %%2",
+		  regop_sign_bit_pos,
+		  cmp_opstr);
+      else
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t.+12, %%1, %d\n\t"
+		  "%s\t.+8, %%1, %%2\n\t"
+		  "jmp\t%%%%label(%%l3)",
+		  regop_sign_bit_pos,
+		  rcmp_opstr);
+    }
+  else if (code == GT || code == GE)
+    {
+      if (is_near)
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t%%l3, %%1, %d\n\t"
+		  "%s\t%%l3, %%1, %%2",
+		  regop_sign_bit_pos,
+		  cmp_opstr);
+      else
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t.+8, %%1, %d\n\t"
+		  "%s\t.+8, %%1, %%2\n\t"
+		  "jmp\t%%%%label(%%l3)",
+		  regop_sign_bit_pos,
+		  rcmp_opstr);
+    }
+  else
+    gcc_unreachable ();
+
+  return buf;
+}
+
+/*
+   Optimized version of pru_output_signed_cbranch_ubyteop2 for constant
+   zero second operand.  */
+
+const char *
+pru_output_signed_cbranch_zeroop2 (rtx *operands, bool is_near)
+{
+  static char buf[1024];
+  enum rtx_code code = GET_CODE (operands[0]);
+  int regop_sign_bit_pos = sign_bit_position (operands[1]);
+
+  /* OP2 is a constant zero - utilize this info to simply check the
+     OP1 sign bit when comparing for LT or GE.  */
+  if (code == LT)
+    {
+      if (is_near)
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t%%l3, %%1, %d\n\t",
+		  regop_sign_bit_pos);
+      else
+	snprintf (buf, sizeof (buf),
+		  "qbbc\t.+8, %%1, %d\n\t"
+		  "jmp\t%%%%label(%%l3)",
+		  regop_sign_bit_pos);
+    }
+  else if (code == GE)
+    {
+      if (is_near)
+	snprintf (buf, sizeof (buf),
+		  "qbbc\t%%l3, %%1, %d\n\t",
+		  regop_sign_bit_pos);
+      else
+	snprintf (buf, sizeof (buf),
+		  "qbbs\t.+8, %%1, %d\n\t"
+		  "jmp\t%%%%label(%%l3)",
+		  regop_sign_bit_pos);
+    }
+  else
+    gcc_unreachable ();
+
+  return buf;
+}
+
+/* Addressing Modes.  */
+
+/* Return true if register REGNO is a valid base register.
+   STRICT_P is true if REG_OK_STRICT is in effect.  */
+
+bool
+pru_regno_ok_for_base_p (int regno, bool strict_p)
+{
+  if (!HARD_REGISTER_NUM_P (regno))
+    {
+      if (!strict_p)
+	return true;
+
+      if (!reg_renumber)
+	return false;
+
+      regno = reg_renumber[regno];
+    }
+
+  /* The fake registers will be eliminated to either the stack or
+     hard frame pointer, both of which are usually valid base registers.
+     Reload deals with the cases where the eliminated form isn't valid.  */
+  return (GP_REG_P (regno)
+	  || regno == FRAME_POINTER_REGNUM
+	  || regno == ARG_POINTER_REGNUM);
+}
+
+static bool
+pru_valid_const_ubyte_offset (machine_mode mode, HOST_WIDE_INT offset)
+{
+  bool valid = UBYTE_INT (offset);
+
+  /* Reload can split multi word accesses, so make sure we can address
+     the second word in a DI.  */
+  if (valid && GET_MODE_SIZE (mode) > GET_MODE_SIZE (SImode))
+    valid = UBYTE_INT (offset + GET_MODE_SIZE (mode) - 1);
+
+  return valid;
+}
+
+/* Recognize a CTABLE base address.  Return CTABLE entry index, or -1 if
+ * base was not found in the pragma-filled pru_ctable.  */
+int pru_get_ctable_exact_base_index (unsigned HOST_WIDE_INT caddr)
+{
+  unsigned int i;
+
+  for (i = 0; i < ARRAY_SIZE (pru_ctable); i++)
+    {
+      if (pru_ctable[i].valid && pru_ctable[i].base == caddr)
+	return i;
+    }
+  return -1;
+}
+
+
+/* Check if the given address can be addressed via CTABLE_BASE + UBYTE_OFFS,
+   and return the base CTABLE index if possible.  */
+int pru_get_ctable_base_index (unsigned HOST_WIDE_INT caddr)
+{
+  unsigned int i;
+
+  for (i = 0; i < ARRAY_SIZE (pru_ctable); i++)
+    {
+      if (pru_ctable[i].valid && IN_RANGE (caddr,
+					   pru_ctable[i].base,
+					   pru_ctable[i].base + 0xff))
+	return i;
+    }
+  return -1;
+}
+
+
+/* Return the offset from some CTABLE base for this address.  */
+int pru_get_ctable_base_offset (unsigned HOST_WIDE_INT caddr)
+{
+  int i;
+
+  i = pru_get_ctable_base_index (caddr);
+  gcc_assert (i >= 0);
+
+  return caddr - pru_ctable[i].base;
+}
+
+/* Return true if the address expression formed by BASE + OFFSET is
+   valid.  */
+static bool
+pru_valid_addr_expr_p (machine_mode mode, rtx base, rtx offset, bool strict_p)
+{
+  if (!strict_p && base != NULL_RTX && GET_CODE (base) == SUBREG)
+    base = SUBREG_REG (base);
+  if (!strict_p && offset != NULL_RTX && GET_CODE (offset) == SUBREG)
+    offset = SUBREG_REG (offset);
+
+  if (REG_P (base)
+      && pru_regno_ok_for_base_p (REGNO (base), strict_p)
+      && (offset == NULL_RTX
+	  || (CONST_INT_P (offset)
+	      && pru_valid_const_ubyte_offset (mode, INTVAL (offset)))
+	  || (REG_P (offset)
+	      && pru_regno_ok_for_index_p (REGNO (offset), strict_p))))
+    {
+      /*     base register + register offset
+       * OR  base register + UBYTE constant offset.  */
+      return true;
+    }
+  else if (REG_P (base)
+	   && pru_regno_ok_for_index_p (REGNO (base), strict_p)
+	   && (offset != NULL_RTX && ctable_base_operand (offset, VOIDmode)))
+    {
+      /*     base CTABLE constant base + register offset
+       * Note: GCC always puts the register as a first operand of PLUS.  */
+      return true;
+    }
+  else if (CONST_INT_P (base)
+	   && offset == NULL_RTX
+	   && (ctable_addr_operand (base, VOIDmode)))
+    {
+      /*     base CTABLE constant base + UBYTE constant offset.  */
+      return true;
+    }
+  else
+    {
+      return false;
+    }
+}
+
+/* Implement TARGET_LEGITIMATE_ADDRESS_P.  */
+static bool
+pru_legitimate_address_p (machine_mode mode,
+			    rtx operand, bool strict_p)
+{
+  switch (GET_CODE (operand))
+    {
+      /* Direct.  */
+    case SYMBOL_REF:
+    case LABEL_REF:
+    case CONST:
+    case CONST_DOUBLE:
+      return false;
+
+    case CONST_INT:
+      return ctable_addr_operand (operand, VOIDmode);
+
+      /* Register indirect.  */
+    case REG:
+      return pru_regno_ok_for_base_p (REGNO (operand), strict_p);
+
+      /* Register indirect with displacement.  */
+    case PLUS:
+	{
+	  rtx op0 = XEXP (operand, 0);
+	  rtx op1 = XEXP (operand, 1);
+
+	  return (pru_valid_addr_expr_p (mode, op0, op1, strict_p)
+		  || pru_valid_addr_expr_p (mode, op1, op0, strict_p));
+	}
+
+    default:
+      break;
+    }
+  return false;
+}
+
+/* Output assembly language related definitions.  */
+
+static void
+pru_elf_asm_constructor (rtx symbol, int priority)
+{
+  char buf[23];
+  section *s;
+
+  if (priority == DEFAULT_INIT_PRIORITY)
+    snprintf (buf, sizeof (buf), ".init_array");
+  else
+    {
+      /* While priority is known to be in range [0, 65535], so 18 bytes
+	 would be enough, the compiler might not know that.  To avoid
+	 -Wformat-truncation false positive, use a larger size.  */
+      snprintf (buf, sizeof (buf), ".init_array.%.5u", priority);
+    }
+  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
+  switch_to_section (s);
+  assemble_aligned_integer (INIT_ARRAY_ENTRY_BYTES, symbol);
+}
+
+static void
+pru_elf_asm_destructor (rtx symbol, int priority)
+{
+  char buf[23];
+  section *s;
+
+  if (priority == DEFAULT_INIT_PRIORITY)
+    snprintf (buf, sizeof (buf), ".fini_array");
+  else
+    {
+      /* While priority is known to be in range [0, 65535], so 18 bytes
+	 would be enough, the compiler might not know that.  To avoid
+	 -Wformat-truncation false positive, use a larger size.  */
+      snprintf (buf, sizeof (buf), ".fini_array.%.5u", priority);
+    }
+  s = get_section (buf, SECTION_WRITE | SECTION_NOTYPE, NULL);
+  switch_to_section (s);
+  assemble_aligned_integer (INIT_ARRAY_ENTRY_BYTES, symbol);
+}
+
+static const char *
+pru_comparison_str (enum rtx_code cond)
+{
+  switch (cond)
+    {
+    case NE:  return "ne";
+    case EQ:  return "eq";
+    case GEU: return "ge";
+    case GTU: return "gt";
+    case LEU: return "le";
+    case LTU: return "lt";
+    default: gcc_unreachable ();
+    }
+}
+
+/* Access some RTX as INT_MODE.  If X is a CONST_FIXED we can get
+   the bit representation of X by "casting" it to CONST_INT.  */
+
+static rtx
+pru_to_int_mode (rtx x)
+{
+  machine_mode mode = GET_MODE (x);
+
+  return VOIDmode == mode
+    ? x
+    : simplify_gen_subreg (int_mode_for_mode (mode).require (), x, mode, 0);
+}
+
+/* Translate between the MachineDescription notion
+   of 8-bit consecutive registers, to the PRU
+   assembler syntax of REGWORD[.SUBREG].  */
+static const char *
+pru_asm_regname (rtx op)
+{
+  static char canon_reg_names[3][LAST_GP_REG][8];
+  int speci, regi;
+
+  gcc_assert (REG_P (op));
+
+  if (!canon_reg_names[0][0][0])
+    {
+      for (regi = 0; regi < LAST_GP_REG; regi++)
+	for (speci = 0; speci < 3; speci++)
+	  {
+	    const int sz = (speci == 0) ? 1 : ((speci == 1) ? 2 : 4);
+	    if ((regi + sz) > (32 * 4))
+	      continue;	/* Invalid entry.  */
+
+	    /* Construct the lookup table.  */
+	    const char *suffix = "";
+
+	    switch ((sz << 8) | (regi % 4))
+	      {
+	      case (1 << 8) | 0: suffix = ".b0"; break;
+	      case (1 << 8) | 1: suffix = ".b1"; break;
+	      case (1 << 8) | 2: suffix = ".b2"; break;
+	      case (1 << 8) | 3: suffix = ".b3"; break;
+	      case (2 << 8) | 0: suffix = ".w0"; break;
+	      case (2 << 8) | 1: suffix = ".w1"; break;
+	      case (2 << 8) | 2: suffix = ".w2"; break;
+	      case (4 << 8) | 0: suffix = ""; break;
+	      default:
+		/* Invalid entry.  */
+		continue;
+	      }
+	    sprintf (&canon_reg_names[speci][regi][0],
+		     "r%d%s", regi / 4, suffix);
+	  }
+    }
+
+  switch (GET_MODE_SIZE (GET_MODE (op)))
+    {
+    case 1: speci = 0; break;
+    case 2: speci = 1; break;
+    case 4: speci = 2; break;
+    case 8: speci = 2; break; /* Existing GCC test cases are not using %F.  */
+    default: gcc_unreachable ();
+    }
+  regi = REGNO (op);
+  gcc_assert (regi < LAST_GP_REG);
+  gcc_assert (canon_reg_names[speci][regi][0]);
+
+  return &canon_reg_names[speci][regi][0];
+}
+
+/* Print the operand OP to file stream FILE modified by LETTER.
+   LETTER can be one of:
+
+     b: prints the register byte start (used by LBBO/SBBO)
+     B: prints 'c' or 'b' for CTABLE or REG base in a memory address
+     F: Full 32-bit register.
+     H: Higher 16-bits of a const_int operand
+     L: Lower 16-bits of a const_int operand
+     N: prints next 32-bit register (upper 32bits of a 64bit REG couple)
+     P: prints swapped condition.
+     Q: prints swapped and reversed condition.
+     R: prints reversed condition.
+     S: print operand mode size (but do not print the operand itself)
+     T: print exact_log2 () for const_int operands
+     V: print exact_log2 () of negated const_int operands
+     w: Lower 32-bits of a const_int operand
+     W: Upper 32-bits of a const_int operand
+     y: print the next 8-bit register (regardless of op size)
+     z: print the second next 8-bit register (regardless of op size)
+*/
+static void
+pru_print_operand (FILE *file, rtx op, int letter)
+{
+
+  switch (letter)
+    {
+    case 'S':
+      fprintf (file, "%d", GET_MODE_SIZE (GET_MODE (op)));
+      return;
+
+    default:
+      break;
+    }
+
+  if (comparison_operator (op, VOIDmode))
+    {
+      enum rtx_code cond = GET_CODE (op);
+      gcc_assert (!pru_signed_cmp_operator (op, VOIDmode));
+
+      switch (letter)
+	{
+	case 0:
+	  fprintf (file, "%s", pru_comparison_str (cond));
+	  return;
+	case 'P':
+	  fprintf (file, "%s", pru_comparison_str (swap_condition (cond)));
+	  return;
+	case 'Q':
+	  cond = swap_condition (cond);
+	  /* Fall through to reverse.  */
+	case 'R':
+	  fprintf (file, "%s", pru_comparison_str (reverse_condition (cond)));
+	  return;
+	}
+    }
+
+  switch (GET_CODE (op))
+    {
+    case REG:
+      if (letter == 0)
+	{
+	  fprintf (file, "%s", pru_asm_regname (op));
+	  return;
+	}
+      else if (letter == 'b')
+	{
+	  gcc_assert (REGNO (op) <= LAST_NONIO_GP_REG);
+	  fprintf (file, "r%d.b%d", REGNO (op) / 4, REGNO (op) % 4);
+	  return;
+	}
+      else if (letter == 'F')
+	{
+	  gcc_assert (REGNO (op) <= LAST_NONIO_GP_REG);
+	  gcc_assert (REGNO (op) % 4 == 0);
+	  fprintf (file, "r%d", REGNO (op) / 4);
+	  return;
+	}
+      else if (letter == 'N')
+	{
+	  gcc_assert (REGNO (op) <= LAST_NONIO_GP_REG);
+	  gcc_assert (REGNO (op) % 4 == 0);
+	  fprintf (file, "r%d", REGNO (op) / 4 + 1);
+	  return;
+	}
+      else if (letter == 'y')
+	{
+	  gcc_assert (REGNO (op) <= LAST_NONIO_GP_REG - 1);
+	  fprintf (file, "%s", reg_names[REGNO (op) + 1]);
+	  return;
+	}
+      else if (letter == 'z')
+	{
+	  gcc_assert (REGNO (op) <= LAST_NONIO_GP_REG - 2);
+	  fprintf (file, "%s", reg_names[REGNO (op) + 2]);
+	  return;
+	}
+      break;
+
+    case CONST_INT:
+      if (letter == 'H')
+	{
+	  HOST_WIDE_INT val = INTVAL (op);
+	  val = (val >> 16) & 0xFFFF;
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      else if (letter == 'L')
+	{
+	  HOST_WIDE_INT val = INTVAL (op);
+	  val &= 0xFFFF;
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      else if (letter == 'T')
+	{
+	  /* The predicate should have already validated the 1-high-bit
+	     requirement.  Use CTZ here to deal with constant's sign
+	     extension.  */
+	  HOST_WIDE_INT val = wi::ctz (INTVAL (op));
+	  gcc_assert (val >= 0 && val <= 31);
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      else if (letter == 'V')
+	{
+	  HOST_WIDE_INT val = wi::ctz (~INTVAL (op));
+	  gcc_assert (val >= 0 && val <= 31);
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      else if (letter == 'w')
+	{
+	  HOST_WIDE_INT val = INTVAL (op) & 0xffffffff;
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      else if (letter == 'W')
+	{
+	  HOST_WIDE_INT val = (INTVAL (op) >> 32) & 0xffffffff;
+	  output_addr_const (file, gen_int_mode (val, SImode));
+	  return;
+	}
+      /* Else, fall through.  */
+
+    case CONST:
+    case LABEL_REF:
+    case SYMBOL_REF:
+      if (letter == 0)
+	{
+	  output_addr_const (file, op);
+	  return;
+	}
+      break;
+
+    case CONST_FIXED:
+	{
+	  HOST_WIDE_INT ival = INTVAL (pru_to_int_mode (op));
+	  if (letter != 0)
+	    output_operand_lossage ("Unsupported code '%c' for fixed-point:",
+				    letter);
+	  fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
+	  return;
+	}
+      break;
+
+    case CONST_DOUBLE:
+      if (letter == 0)
+	{
+	  long val;
+
+	  if (GET_MODE (op) != SFmode)
+	    fatal_insn ("internal compiler error.  Unknown mode:", op);
+	  REAL_VALUE_TO_TARGET_SINGLE (*CONST_DOUBLE_REAL_VALUE (op), val);
+	  fprintf (file, "0x%lx", val);
+	  return;
+	}
+      else if (letter == 'w' || letter == 'W')
+	{
+	  long t[2];
+	  REAL_VALUE_TO_TARGET_DOUBLE (*CONST_DOUBLE_REAL_VALUE (op), t);
+	  fprintf (file, "0x%lx", t[letter == 'w' ? 0 : 1]);
+	  return;
+	}
+      else
+	{
+	  gcc_unreachable ();
+	}
+      break;
+
+    case SUBREG:
+    case MEM:
+      if (letter == 0)
+	{
+	  output_address (VOIDmode, op);
+	  return;
+	}
+      else if (letter == 'B')
+	{
+	  rtx base = XEXP (op, 0);
+	  if (GET_CODE (base) == PLUS)
+	    {
+	      rtx op0 = XEXP (base, 0);
+	      rtx op1 = XEXP (base, 1);
+
+	      /* PLUS cannot have two constant operands, so one
+		 of them must be a REG, hence we must check for an
+		 exact base address.  */
+	      if (ctable_base_operand (op0, VOIDmode)
+		  || ctable_base_operand (op1, VOIDmode))
+		{
+		  fprintf (file, "c");
+		  return;
+		}
+	      else if (REG_P (op0) || REG_P (op1))
+		{
+		  fprintf (file, "b");
+		  return;
+		}
+	      else
+		gcc_unreachable ();
+	    }
+	  else if (REG_P (base))
+	    {
+	      fprintf (file, "b");
+	      return;
+	    }
+	  else if (ctable_addr_operand (base, VOIDmode))
+	    {
+	      fprintf (file, "c");
+	      return;
+	    }
+	  else
+	    gcc_unreachable ();
+	}
+      break;
+
+    case CODE_LABEL:
+      if (letter == 0)
+	{
+	  output_addr_const (file, op);
+	  return;
+	}
+      break;
+
+    default:
+      break;
+    }
+
+  output_operand_lossage ("Unsupported operand %s for code '%c'",
+			  GET_RTX_NAME (GET_CODE (op)), letter);
+  gcc_unreachable ();
+}
+
+/* Implement TARGET_PRINT_OPERAND_ADDRESS.  */
+static void
+pru_print_operand_address (FILE *file, machine_mode mode, rtx op)
+{
+  if (GET_CODE (op) != REG && CONSTANT_ADDRESS_P (op)
+      && text_segment_operand (op, VOIDmode))
+    {
+      fprintf (stderr, "Unexpectred text address?\n");
+      debug_rtx (op);
+      gcc_unreachable ();
+    }
+
+  switch (GET_CODE (op))
+    {
+    case CONST:
+    case LABEL_REF:
+    case CONST_DOUBLE:
+    case SYMBOL_REF:
+      break;
+
+    case CONST_INT:
+      {
+	unsigned HOST_WIDE_INT caddr = INTVAL (op);
+	int base = pru_get_ctable_base_index (caddr);
+	int offs = pru_get_ctable_base_offset (caddr);
+	gcc_assert (base >= 0);
+	fprintf (file, "%d, %d", base, offs);
+	return;
+      }
+      break;
+
+    case PLUS:
+      {
+	int base;
+	rtx op0 = XEXP (op, 0);
+	rtx op1 = XEXP (op, 1);
+
+	if (REG_P (op0) && CONST_INT_P (op1)
+	    && pru_get_ctable_exact_base_index (INTVAL (op1)) >= 0)
+	  {
+	    base = pru_get_ctable_exact_base_index (INTVAL (op1));
+	    fprintf (file, "%d, %s", base, pru_asm_regname (op0));
+	    return;
+	  }
+	else if (REG_P (op1) && CONST_INT_P (op0)
+		 && pru_get_ctable_exact_base_index (INTVAL (op0)) >= 0)
+	  {
+	    base = pru_get_ctable_exact_base_index (INTVAL (op0));
+	    fprintf (file, "%d, %s", base, pru_asm_regname (op1));
+	    return;
+	  }
+	else if (REG_P (op0) && CONSTANT_P (op1))
+	  {
+	    fprintf (file, "%s, ", pru_asm_regname (op0));
+	    output_addr_const (file, op1);
+	    return;
+	  }
+	else if (REG_P (op1) && CONSTANT_P (op0))
+	  {
+	    fprintf (file, "%s, ", pru_asm_regname (op1));
+	    output_addr_const (file, op0);
+	    return;
+	  }
+	else if (REG_P (op1) && REG_P (op0))
+	  {
+	    fprintf (file, "%s, %s", pru_asm_regname (op0),
+				     pru_asm_regname (op1));
+	    return;
+	  }
+      }
+      break;
+
+    case REG:
+      fprintf (file, "%s, 0", pru_asm_regname (op));
+      return;
+
+    case MEM:
+      {
+	rtx base = XEXP (op, 0);
+	pru_print_operand_address (file, mode, base);
+	return;
+      }
+    default:
+      break;
+    }
+
+  fprintf (stderr, "Missing way to print address\n");
+  debug_rtx (op);
+  gcc_unreachable ();
+}
+
+/* Implement TARGET_ASM_FUNCTION_PROLOGUE.  */
+static void
+pru_asm_function_prologue (FILE *file)
+{
+  if (flag_verbose_asm || flag_debug_asm)
+    {
+      pru_compute_frame_layout ();
+      pru_dump_frame_layout (file);
+    }
+}
+
+/* Implement `TARGET_ASM_INTEGER'.  */
+/* Target hook for assembling integer objects.  PRU version needs
+   special handling for references to pmem.  Code copied from AVR.  */
+
+static bool
+pru_assemble_integer (rtx x, unsigned int size, int aligned_p)
+{
+  if (size == POINTER_SIZE / BITS_PER_UNIT
+      && aligned_p
+      && text_segment_operand (x, VOIDmode))
+    {
+      fputs ("\t.4byte\t%pmem(", asm_out_file);
+      output_addr_const (asm_out_file, x);
+      fputs (")\n", asm_out_file);
+
+      return true;
+    }
+  else if (size == INIT_ARRAY_ENTRY_BYTES
+	   && aligned_p
+	   && text_segment_operand (x, VOIDmode))
+    {
+      fputs ("\t.2byte\t%pmem(", asm_out_file);
+      output_addr_const (asm_out_file, x);
+      fputs (")\n", asm_out_file);
+
+      return true;
+    }
+  else
+    {
+      return default_assemble_integer (x, size, aligned_p);
+    }
+}
+
+/* Implement TARGET_ASM_FILETARGET_ASM_FILE_START_START.  */
+
+static void
+pru_file_start (void)
+{
+  default_file_start ();
+
+  /* Compiler will take care of placing %label, so there is no
+     need to confuse users with this warning.  */
+  fprintf (asm_out_file, "\t.set no_warn_regname_label\n");
+}
+
+/* Function argument related.  */
+
+static int
+pru_function_arg_size (machine_mode mode, const_tree type)
+{
+  HOST_WIDE_INT param_size;
+
+  if (mode == BLKmode)
+    param_size = int_size_in_bytes (type);
+  else
+    param_size = GET_MODE_SIZE (mode);
+
+  /* Convert to words (round up).  */
+  param_size = (UNITS_PER_WORD - 1 + param_size) / UNITS_PER_WORD;
+  gcc_assert (param_size >= 0);
+
+  return param_size;
+}
+
+/* Check if argument with the given size must be
+   passed/returned in a register.
+
+   Reference:
+   https://e2e.ti.com/support/development_tools/compiler/f/343/p/650176/2393029
+
+   Arguments other than 8/16/24/32/64bits are passed on stack.  */
+static bool
+pru_arg_in_reg_bysize (size_t sz)
+{
+  return sz == 1 || sz == 2 || sz == 3 || sz == 4 || sz == 8;
+}
+
+static int
+pru_function_arg_regi (cumulative_args_t cum_v,
+		       machine_mode mode, const_tree type,
+		       bool named)
+{
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+  size_t argsize = pru_function_arg_size (mode, type);
+  size_t i, bi;
+  int regi = -1;
+
+  if (!pru_arg_in_reg_bysize (argsize))
+    return -1;
+
+  if (!named)
+    return -1;
+
+  /* Find the first available slot that fits.  Yes, that's the PRU ABI.  */
+  for (i = 0; regi < 0 && i < ARRAY_SIZE (cum->regs_used); i++)
+    {
+      if (mode == BLKmode)
+	{
+	  /* Structs are passed, beginning at a full register.  */
+	  if ((i % 4) != 0)
+	    continue;
+	}
+      else
+	{
+	  /* Scalar arguments.  */
+
+	  /* Ensure SI and DI arguments are stored in full registers only.  */
+	  if ((argsize >= 4) && (i % 4) != 0)
+	    continue;
+
+	  /* rX.w0/w1/w2 are OK.  But avoid spreading the second byte
+	     into a different full register.  */
+	  if (argsize == 2 && (i % 4) == 3)
+	    continue;
+	}
+
+      for (bi = 0;
+	   bi < argsize && (bi + i) < ARRAY_SIZE (cum->regs_used);
+	   bi++)
+	{
+	  if (cum->regs_used[bi + i])
+	    break;
+	}
+      if (bi == argsize)
+	regi = FIRST_ARG_REGNO + i;
+    }
+
+  return regi;
+}
+
+static void
+pru_function_arg_regi_mark_slot (int regi,
+				 cumulative_args_t cum_v,
+				 machine_mode mode, const_tree type,
+				 bool named)
+{
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+  HOST_WIDE_INT param_size = pru_function_arg_size (mode, type);
+
+  gcc_assert (named);
+
+  /* Mark all byte sub-registers occupied by argument as used.  */
+  while (param_size--)
+    {
+      gcc_assert (regi >= FIRST_ARG_REGNO && regi <= LAST_ARG_REGNO);
+      gcc_assert (!cum->regs_used[regi - FIRST_ARG_REGNO]);
+      cum->regs_used[regi - FIRST_ARG_REGNO] = true;
+      regi++;
+    }
+}
+
+/* Define where to put the arguments to a function.  Value is zero to
+   push the argument on the stack, or a hard register in which to
+   store the argument.
+
+   MODE is the argument's machine mode.
+   TYPE is the data type of the argument (as a tree).
+   This is null for libcalls where that information may
+   not be available.
+   CUM is a variable of type CUMULATIVE_ARGS which gives info about
+   the preceding args and about the function being called.
+   NAMED is nonzero if this argument is a named parameter
+   (otherwise it is an extra parameter matching an ellipsis).  */
+
+static rtx
+pru_function_arg (cumulative_args_t cum_v, machine_mode mode,
+		    const_tree type,
+		    bool named)
+{
+  rtx return_rtx = NULL_RTX;
+  int regi = pru_function_arg_regi (cum_v, mode, type, named);
+
+  if (regi >= 0)
+    return_rtx = gen_rtx_REG (mode, regi);
+
+  return return_rtx;
+}
+
+/* Return number of bytes, at the beginning of the argument, that must be
+   put in registers.  0 is the argument is entirely in registers or entirely
+   in memory.  */
+
+static int
+pru_arg_partial_bytes (cumulative_args_t cum_v ATTRIBUTE_UNUSED,
+		       machine_mode mode ATTRIBUTE_UNUSED,
+		       tree type ATTRIBUTE_UNUSED,
+		       bool named ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+/* Update the data in CUM to advance over an argument of mode MODE
+   and data type TYPE; TYPE is null for libcalls where that information
+   may not be available.  */
+
+static void
+pru_function_arg_advance (cumulative_args_t cum_v, machine_mode mode,
+			    const_tree type,
+			    bool named)
+{
+  int regi = pru_function_arg_regi (cum_v, mode, type, named);
+
+  if (regi >= 0)
+    pru_function_arg_regi_mark_slot (regi, cum_v, mode, type, named);
+}
+
+/* Implement TARGET_FUNCTION_VALUE.  */
+static rtx
+pru_function_value (const_tree ret_type, const_tree fn ATTRIBUTE_UNUSED,
+		      bool outgoing ATTRIBUTE_UNUSED)
+{
+  return gen_rtx_REG (TYPE_MODE (ret_type), FIRST_RETVAL_REGNO);
+}
+
+/* Implement TARGET_LIBCALL_VALUE.  */
+static rtx
+pru_libcall_value (machine_mode mode, const_rtx fun ATTRIBUTE_UNUSED)
+{
+  return gen_rtx_REG (mode, FIRST_RETVAL_REGNO);
+}
+
+/* Implement TARGET_FUNCTION_VALUE_REGNO_P.  */
+static bool
+pru_function_value_regno_p (const unsigned int regno)
+{
+  return regno == FIRST_RETVAL_REGNO;
+}
+
+/* Implement TARGET_RETURN_IN_MEMORY.  */
+bool
+pru_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED)
+{
+  bool in_memory = (!pru_arg_in_reg_bysize (int_size_in_bytes (type))
+		    || int_size_in_bytes (type) == -1);
+
+  return in_memory;
+}
+
+/* Implement TARGET_CAN_USE_DOLOOP_P.  */
+
+static bool
+pru_can_use_doloop_p (const widest_int &, const widest_int &iterations_max,
+		      unsigned int loop_depth, bool)
+{
+  /* Considering limitations in the hardware, only use doloop
+     for innermost loops which must be entered from the top.  */
+  if (loop_depth > 1)
+    return false;
+  /* PRU internal loop counter is 16bits wide.  Remember that iterations_max
+     holds the maximum number of loop latch executions, while PRU loop
+     instruction needs the count of loop body executions.  */
+  if (iterations_max == 0 || wi::geu_p (iterations_max, 0xffff))
+    return false;
+
+  return true;
+}
+
+/* NULL if INSN insn is valid within a low-overhead loop.
+   Otherwise return why doloop cannot be applied.  */
+
+static const char *
+pru_invalid_within_doloop (const rtx_insn *insn)
+{
+  if (CALL_P (insn))
+    return "Function call in the loop.";
+
+  if (JUMP_P (insn) && INSN_CODE (insn) == CODE_FOR_return)
+    return "Return from a call instruction in the loop.";
+
+  if (NONDEBUG_INSN_P (insn)
+      && INSN_CODE (insn) < 0
+      && (GET_CODE (PATTERN (insn)) == ASM_INPUT
+	  || asm_noperands (PATTERN (insn)) >= 0))
+    return "Loop contains asm statement.";
+
+  return NULL;
+}
+
+
+/* Figure out where to put LABEL, which is the label for a repeat loop.
+   The loop ends just before LAST_INSN.  If SHARED, insns other than the
+   "repeat" might use LABEL to jump to the loop's continuation point.
+
+   Return the last instruction in the adjusted loop.  */
+
+static rtx_insn *
+pru_insert_loop_label_last (rtx_insn *last_insn, rtx_code_label *label,
+			    bool shared)
+{
+  rtx_insn *next, *prev;
+  int count = 0, code, icode;
+
+  if (dump_file)
+    fprintf (dump_file, "considering end of repeat loop at insn %d\n",
+	     INSN_UID (last_insn));
+
+  /* Set PREV to the last insn in the loop.  */
+  prev = PREV_INSN (last_insn);
+
+  /* Set NEXT to the next insn after the loop label.  */
+  next = last_insn;
+  if (!shared)
+    while (prev != 0)
+      {
+	code = GET_CODE (prev);
+	if (code == CALL_INSN || code == CODE_LABEL || code == BARRIER)
+	  break;
+
+	if (INSN_P (prev))
+	  {
+	    if (GET_CODE (PATTERN (prev)) == SEQUENCE)
+	      prev = as_a <rtx_insn *> (XVECEXP (PATTERN (prev), 0, 1));
+
+	    /* Other insns that should not be in the last two opcodes.  */
+	    icode = recog_memoized (prev);
+	    if (icode < 0
+		|| icode == CODE_FOR_pruloophi
+		|| icode == CODE_FOR_pruloopsi)
+	      break;
+
+	    count++;
+	    next = prev;
+	    if (dump_file)
+	      print_rtl_single (dump_file, next);
+	    if (count == 2)
+	      break;
+	  }
+	prev = PREV_INSN (prev);
+      }
+
+  /* Insert the nops.  */
+  if (dump_file && count < 2)
+    fprintf (dump_file, "Adding %d nop%s inside loop\n\n",
+	     2 - count, count == 1 ? "" : "s");
+
+  for (; count < 2; count++)
+      emit_insn_before (gen_nop (), last_insn);
+
+  /* Insert the label.  */
+  emit_label_before (label, last_insn);
+
+  return last_insn;
+}
+
+
+void
+pru_emit_doloop (rtx *operands, int is_end)
+{
+  rtx tag;
+
+  if (cfun->machine->doloop_tags == 0
+      || cfun->machine->doloop_tag_from_end == is_end)
+    {
+      cfun->machine->doloop_tags++;
+      cfun->machine->doloop_tag_from_end = is_end;
+    }
+
+  tag = GEN_INT (cfun->machine->doloop_tags - 1);
+  machine_mode opmode = GET_MODE (operands[0]);
+  if (is_end)
+    {
+      if (opmode == HImode)
+	emit_jump_insn (gen_doloop_end_internalhi (operands[0],
+						   operands[1], tag));
+      else if (opmode == SImode)
+	emit_jump_insn (gen_doloop_end_internalsi (operands[0],
+						   operands[1], tag));
+      else
+	gcc_unreachable ();
+    }
+  else
+    {
+      if (opmode == HImode)
+	emit_insn (gen_doloop_begin_internalhi (operands[0], operands[0], tag));
+      else if (opmode == SImode)
+	emit_insn (gen_doloop_begin_internalsi (operands[0], operands[0], tag));
+      else
+	gcc_unreachable ();
+    }
+}
+
+
+/* Code for converting doloop_begins and doloop_ends into valid
+   PRU instructions.  Idea and code snippets borrowed from mep port.
+
+   A doloop_begin is just a placeholder:
+
+	$count = unspec ($count)
+
+   where $count is initially the number of iterations.
+   doloop_end has the form:
+
+	if (--$count == 0) goto label
+
+   The counter variable is private to the doloop insns, nothing else
+   relies on its value.
+
+   There are three cases, in decreasing order of preference:
+
+      1.  A loop has exactly one doloop_begin and one doloop_end.
+	 The doloop_end branches to the first instruction after
+	 the doloop_begin.
+
+	 In this case we can replace the doloop_begin with a LOOP
+	 instruction and remove the doloop_end.  I.e.:
+
+		$count1 = unspec ($count1)
+	    label:
+		...
+		if (--$count2 != 0) goto label
+
+	  becomes:
+
+		LOOP end_label,$count1
+	    label:
+		...
+	    end_label:
+		# end loop
+
+      2.  As for (1), except there are several doloop_ends.  One of them
+	 (call it X) falls through to a label L.  All the others fall
+	 through to branches to L.
+
+	 In this case, we remove X and replace the other doloop_ends
+	 with branches to the LOOP label.  For example:
+
+		$count1 = unspec ($count1)
+	    label:
+		...
+		if (--$count1 != 0) goto label
+	    end_label:
+		...
+		if (--$count2 != 0) goto label
+		goto end_label
+
+	 becomes:
+
+		LOOP end_label,$count1
+	    label:
+		...
+	    end_label:
+		# end repeat
+		...
+		goto end_label
+
+      3.  The fallback case.  Replace doloop_begins with:
+
+		$count = $count
+
+	 Replace doloop_ends with the equivalent of:
+
+		$count = $count - 1
+		if ($count != 0) goto loop_label
+
+	 */
+
+/* A structure describing one doloop_begin.  */
+struct pru_doloop_begin {
+  /* The next doloop_begin with the same tag.  */
+  struct pru_doloop_begin *next;
+
+  /* The instruction itself.  */
+  rtx_insn *insn;
+
+  /* The initial counter value.  */
+  rtx loop_count;
+
+  /* The counter register.  */
+  rtx counter;
+};
+
+/* A structure describing a doloop_end.  */
+struct pru_doloop_end {
+  /* The next doloop_end with the same loop tag.  */
+  struct pru_doloop_end *next;
+
+  /* The instruction itself.  */
+  rtx_insn *insn;
+
+  /* The first instruction after INSN when the branch isn't taken.  */
+  rtx_insn *fallthrough;
+
+  /* The location of the counter value.  Since doloop_end_internal is a
+     jump instruction, it has to allow the counter to be stored anywhere
+     (any non-fixed register).  */
+  rtx counter;
+
+  /* The target label (the place where the insn branches when the counter
+     isn't zero).  */
+  rtx label;
+
+  /* A scratch register.  Only available when COUNTER isn't stored
+     in a general register.  */
+  rtx scratch;
+};
+
+
+/* One do-while loop.  */
+struct pru_doloop {
+  /* All the doloop_begins for this loop (in no particular order).  */
+  struct pru_doloop_begin *begin;
+
+  /* All the doloop_ends.  When there is more than one, arrange things
+     so that the first one is the most likely to be X in case (2) above.  */
+  struct pru_doloop_end *end;
+};
+
+
+/* Return true if LOOP can be converted into LOOP form
+   (that is, if it matches cases (1) or (2) above).  */
+
+static bool
+pru_repeat_loop_p (struct pru_doloop *loop)
+{
+  struct pru_doloop_end *end;
+  rtx fallthrough;
+
+  /* There must be exactly one doloop_begin and at least one doloop_end.  */
+  if (loop->begin == 0 || loop->end == 0 || loop->begin->next != 0)
+    return false;
+
+  /* The first doloop_end (X) must branch back to the insn after
+     the doloop_begin.  */
+  if (prev_real_insn (as_a<rtx_insn *> (loop->end->label)) != loop->begin->insn)
+    return false;
+
+  /* Check that the first doloop_end (X) can actually reach
+     doloop_begin () with U8_PCREL relocation for LOOP instruction.  */
+  if (get_attr_length (loop->end->insn) != 4)
+    return false;
+
+  /* All the other doloop_ends must branch to the same place as X.
+     When the branch isn't taken, they must jump to the instruction
+     after X.  */
+  fallthrough = loop->end->fallthrough;
+  for (end = loop->end->next; end != 0; end = end->next)
+    if (end->label != loop->end->label
+	|| !simplejump_p (end->fallthrough)
+	|| next_real_insn (JUMP_LABEL (end->fallthrough)) != fallthrough)
+      return false;
+
+  return true;
+}
+
+
+/* The main repeat reorg function.  See comment above for details.  */
+
+static void
+pru_reorg_loop (rtx_insn *insns)
+{
+  rtx_insn *insn;
+  struct pru_doloop *loops, *loop;
+  struct pru_doloop_begin *begin;
+  struct pru_doloop_end *end;
+  size_t tmpsz;
+
+  /* Quick exit if we haven't created any loops.  */
+  if (cfun->machine->doloop_tags == 0)
+    return;
+
+  /* Create an array of pru_doloop structures.  */
+  tmpsz = sizeof (loops[0]) * cfun->machine->doloop_tags;
+  loops = (struct pru_doloop *) alloca (tmpsz);
+  memset (loops, 0, sizeof (loops[0]) * cfun->machine->doloop_tags);
+
+  /* Search the function for do-while insns and group them by loop tag.  */
+  for (insn = insns; insn; insn = NEXT_INSN (insn))
+    if (INSN_P (insn))
+      switch (recog_memoized (insn))
+	{
+	case CODE_FOR_doloop_begin_internalhi:
+	case CODE_FOR_doloop_begin_internalsi:
+	  insn_extract (insn);
+	  loop = &loops[INTVAL (recog_data.operand[2])];
+
+	  tmpsz = sizeof (struct pru_doloop_begin);
+	  begin = (struct pru_doloop_begin *) alloca (tmpsz);
+	  begin->next = loop->begin;
+	  begin->insn = insn;
+	  begin->loop_count = recog_data.operand[1];
+	  begin->counter = recog_data.operand[0];
+
+	  loop->begin = begin;
+	  break;
+
+	case CODE_FOR_doloop_end_internalhi:
+	case CODE_FOR_doloop_end_internalsi:
+	  insn_extract (insn);
+	  loop = &loops[INTVAL (recog_data.operand[2])];
+
+	  tmpsz = sizeof (struct pru_doloop_end);
+	  end = (struct pru_doloop_end *) alloca (tmpsz);
+	  end->insn = insn;
+	  end->fallthrough = next_real_insn (insn);
+	  end->counter = recog_data.operand[0];
+	  end->label = recog_data.operand[1];
+	  end->scratch = recog_data.operand[3];
+
+	  /* If this insn falls through to an unconditional jump,
+	     give it a lower priority than the others.  */
+	  if (loop->end != 0 && simplejump_p (end->fallthrough))
+	    {
+	      end->next = loop->end->next;
+	      loop->end->next = end;
+	    }
+	  else
+	    {
+	      end->next = loop->end;
+	      loop->end = end;
+	    }
+	  break;
+	}
+
+  /* Convert the insns for each loop in turn.  */
+  for (loop = loops; loop < loops + cfun->machine->doloop_tags; loop++)
+    if (pru_repeat_loop_p (loop))
+      {
+	/* Case (1) or (2).  */
+	rtx_code_label *repeat_label;
+	rtx label_ref;
+
+	/* Create a new label for the repeat insn.  */
+	repeat_label = gen_label_rtx ();
+
+	/* Replace the doloop_begin with a repeat.  We get rid
+	   of the iteration register because LOOP instruction
+	   will utilize an internal for the PRU core LOOP register.  */
+	label_ref = gen_rtx_LABEL_REF (VOIDmode, repeat_label);
+	machine_mode loop_mode = GET_MODE (loop->begin->loop_count);
+	if (loop_mode == HImode)
+	  emit_insn_before (gen_pruloophi (loop->begin->loop_count, label_ref),
+			    loop->begin->insn);
+	else if (loop_mode == SImode)
+	  {
+	    rtx loop_rtx = gen_pruloopsi (loop->begin->loop_count, label_ref);
+	    emit_insn_before (loop_rtx, loop->begin->insn);
+	  }
+	else if (loop_mode == VOIDmode)
+	  {
+	    gcc_assert (CONST_INT_P (loop->begin->loop_count));
+	    gcc_assert (UBYTE_INT ( INTVAL (loop->begin->loop_count)));
+	    rtx loop_rtx = gen_pruloopsi (loop->begin->loop_count, label_ref);
+	    emit_insn_before (loop_rtx, loop->begin->insn);
+	  }
+	else
+	  gcc_unreachable ();
+	delete_insn (loop->begin->insn);
+
+	/* Insert the repeat label before the first doloop_end.
+	   Fill the gap with nops if LOOP insn is less than 2
+	   instructions away than loop->end.  */
+	pru_insert_loop_label_last (loop->end->insn, repeat_label,
+				    loop->end->next != 0);
+
+	/* Emit a pruloop_end (to improve the readability of the output).  */
+	emit_insn_before (gen_pruloop_end (), loop->end->insn);
+
+	/* HACK: TODO: This is usually not needed, but is required for
+	   a few rare cases where a JUMP that breaks the loop
+	   references the LOOP_END address.  In other words, since
+	   we're missing a real "loop_end" instruction, a loop "break"
+	   may accidentally reference the loop end itself, and thus
+	   continuing the cycle.  */
+	for (insn = NEXT_INSN (loop->end->insn);
+	     insn != next_real_insn (loop->end->insn);
+	     insn = NEXT_INSN (insn))
+	  {
+	    if (LABEL_P (insn) && LABEL_NUSES (insn) > 0)
+	      emit_insn_before (gen_nop_loop_guard (), loop->end->insn);
+	  }
+
+	/* Delete the first doloop_end.  */
+	delete_insn (loop->end->insn);
+
+	/* Replace the others with branches to REPEAT_LABEL.  */
+	for (end = loop->end->next; end != 0; end = end->next)
+	  {
+	    rtx_insn *newjmp;
+	    newjmp = emit_jump_insn_before (gen_jump (repeat_label), end->insn);
+	    JUMP_LABEL (newjmp) = repeat_label;
+	    delete_insn (end->insn);
+	    delete_insn (end->fallthrough);
+	  }
+      }
+    else
+      {
+	/* Case (3).  First replace all the doloop_begins with setting
+	   the HW register used for loop counter.  */
+	for (begin = loop->begin; begin != 0; begin = begin->next)
+	  {
+	    insn = gen_move_insn (copy_rtx (begin->counter),
+				  copy_rtx (begin->loop_count));
+	    emit_insn_before (insn, begin->insn);
+	    delete_insn (begin->insn);
+	  }
+
+	/* Replace all the doloop_ends with decrement-and-branch sequences.  */
+	for (end = loop->end; end != 0; end = end->next)
+	  {
+	    rtx reg;
+
+	    start_sequence ();
+
+	    /* Load the counter value into a general register.  */
+	    reg = end->counter;
+	    if (!REG_P (reg) || REGNO (reg) > LAST_NONIO_GP_REG)
+	      {
+		reg = end->scratch;
+		emit_move_insn (copy_rtx (reg), copy_rtx (end->counter));
+	      }
+
+	    /* Decrement the counter.  */
+	    emit_insn (gen_add3_insn (copy_rtx (reg), copy_rtx (reg),
+				      constm1_rtx));
+
+	    /* Copy it back to its original location.  */
+	    if (reg != end->counter)
+	      emit_move_insn (copy_rtx (end->counter), copy_rtx (reg));
+
+	    /* Jump back to the start label.  */
+	    insn = emit_jump_insn (gen_cbranchsi4 (gen_rtx_NE (VOIDmode, reg,
+							       const0_rtx),
+						   reg,
+						   const0_rtx,
+						   end->label));
+
+	    JUMP_LABEL (insn) = end->label;
+	    LABEL_NUSES (end->label)++;
+
+	    /* Emit the whole sequence before the doloop_end.  */
+	    insn = get_insns ();
+	    end_sequence ();
+	    emit_insn_before (insn, end->insn);
+
+	    /* Delete the doloop_end.  */
+	    delete_insn (end->insn);
+	  }
+      }
+}
+
+static void
+pru_reorg (void)
+{
+  rtx_insn *insns = get_insns ();
+
+  compute_bb_for_insn ();
+  df_analyze ();
+
+  /* need correct insn lengths for allowing LOOP instruction
+     emitting due to U8_PCREL limitations.  */
+  shorten_branches (get_insns ());
+
+  /* The generic reorg_loops () is not suitable for PRU because
+     it doesn't handle doloop_begin/end tying.  And we need our
+     doloop_begin emitted before reload.  It is difficult to coalesce
+     UBYTE constant initial loop values into the LOOP insn during
+     machine reorg phase.  */
+  pru_reorg_loop (insns);
+
+  df_finish_pass (false);
+}
+
+enum pru_builtin
+{
+  PRU_BUILTIN_DELAY_CYCLES,
+  PRU_BUILTIN_max
+};
+
+static GTY(()) tree pru_builtins [(int) PRU_BUILTIN_max];
+
+/* Implement TARGET_INIT_BUILTINS.  */
+static void
+pru_init_builtins (void)
+{
+  tree void_ftype_longlong
+    = build_function_type_list (void_type_node,
+				long_long_integer_type_node,
+				NULL);
+
+  pru_builtins[PRU_BUILTIN_DELAY_CYCLES]
+    = add_builtin_function ( "__delay_cycles", void_ftype_longlong,
+			    PRU_BUILTIN_DELAY_CYCLES, BUILT_IN_MD, NULL,
+			    NULL_TREE);
+}
+
+/* Implement TARGET_BUILTIN_DECL.  */
+static tree
+pru_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  switch (code)
+    {
+    case PRU_BUILTIN_DELAY_CYCLES:
+      return pru_builtins[code];
+    default:
+      return error_mark_node;
+    }
+}
+
+static rtx
+pru_expand_delay_cycles (rtx arg)
+{
+  HOST_WIDE_INT c, n;
+
+  if (GET_CODE (arg) != CONST_INT)
+    {
+      error ("__delay_cycles() only takes constant arguments");
+      return NULL_RTX;
+    }
+
+  c = INTVAL (arg);
+
+  if (HOST_BITS_PER_WIDE_INT > 32)
+    {
+      if (c < 0)
+	{
+	  error ("__delay_cycles only takes non-negative cycle counts.");
+	  return NULL_RTX;
+	}
+    }
+
+  emit_insn (gen_delay_cycles_start (arg));
+
+  /* For 32-bit loops, there's 2 + 2x cycles.  */
+  if (c > 2 * 0xffff + 1)
+    {
+      n = (c - 2) / 2;
+      c -= (n * 2) + 2;
+      if ((unsigned long long) n > 0xffffffffULL)
+	{
+	  error ("__delay_cycles is limited to 32-bit loop counts.");
+	  return NULL_RTX;
+	}
+      emit_insn (gen_delay_cycles_2x_plus2_si (GEN_INT (n)));
+    }
+
+  /* For 16-bit loops, there's 1 + 2x cycles.  */
+  if (c > 2)
+    {
+      n = (c - 1) / 2;
+      c -= (n * 2) + 1;
+
+      emit_insn (gen_delay_cycles_2x_plus1_hi (GEN_INT (n)));
+    }
+
+  while (c > 0)
+    {
+      emit_insn (gen_delay_cycles_1 ());
+      c -= 1;
+    }
+
+  emit_insn (gen_delay_cycles_end (arg));
+
+  return NULL_RTX;
+}
+
+
+/* Implement TARGET_EXPAND_BUILTIN.  Expand an expression EXP that calls
+   a built-in function, with result going to TARGET if that's convenient
+   (and in mode MODE if that's convenient).
+   SUBTARGET may be used as the target for computing one of EXP's operands.
+   IGNORE is nonzero if the value is to be ignored.  */
+
+static rtx
+pru_expand_builtin (tree exp, rtx target ATTRIBUTE_UNUSED,
+		    rtx subtarget ATTRIBUTE_UNUSED,
+		    machine_mode mode ATTRIBUTE_UNUSED,
+		    int ignore ATTRIBUTE_UNUSED)
+{
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
+  rtx arg1 = expand_normal (CALL_EXPR_ARG (exp, 0));
+
+  if (fcode == PRU_BUILTIN_DELAY_CYCLES)
+    return pru_expand_delay_cycles (arg1);
+
+  internal_error ("bad builtin code");
+
+  return NULL_RTX;
+}
+
+/* Remember the last target of pru_set_current_function.  */
+static GTY(()) tree pru_previous_fndecl;
+
+/* Establish appropriate back-end context for processing the function
+   FNDECL.  The argument might be NULL to indicate processing at top
+   level, outside of any function scope.  */
+static void
+pru_set_current_function (tree fndecl)
+{
+  tree old_tree = (pru_previous_fndecl
+		   ? DECL_FUNCTION_SPECIFIC_TARGET (pru_previous_fndecl)
+		   : NULL_TREE);
+
+  tree new_tree = (fndecl
+		   ? DECL_FUNCTION_SPECIFIC_TARGET (fndecl)
+		   : NULL_TREE);
+
+  if (fndecl && fndecl != pru_previous_fndecl)
+    {
+      pru_previous_fndecl = fndecl;
+      if (old_tree == new_tree)
+	;
+
+      else if (new_tree)
+	{
+	  cl_target_option_restore (&global_options,
+				    TREE_TARGET_OPTION (new_tree));
+	  target_reinit ();
+	}
+
+      else if (old_tree)
+	{
+	  struct cl_target_option *def
+	    = TREE_TARGET_OPTION (target_option_current_node);
+
+	  cl_target_option_restore (&global_options, def);
+	  target_reinit ();
+	}
+    }
+}
+
+
+static scalar_int_mode
+pru_unwind_word_mode (void)
+{
+  return SImode;
+}
+
+
+/* Initialize the GCC target structure.  */
+#undef TARGET_ASM_FUNCTION_PROLOGUE
+#define TARGET_ASM_FUNCTION_PROLOGUE pru_asm_function_prologue
+#undef TARGET_ASM_INTEGER
+#define TARGET_ASM_INTEGER pru_assemble_integer
+
+#undef TARGET_ASM_FILE_START
+#define TARGET_ASM_FILE_START pru_file_start
+
+#undef TARGET_INIT_BUILTINS
+#define TARGET_INIT_BUILTINS pru_init_builtins
+#undef TARGET_EXPAND_BUILTIN
+#define TARGET_EXPAND_BUILTIN pru_expand_builtin
+#undef TARGET_BUILTIN_DECL
+#define TARGET_BUILTIN_DECL pru_builtin_decl
+
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL hook_bool_tree_tree_true
+
+#undef TARGET_CAN_ELIMINATE
+#define TARGET_CAN_ELIMINATE pru_can_eliminate
+
+#undef TARGET_MODES_TIEABLE_P
+#define TARGET_MODES_TIEABLE_P pru_modes_tieable_p
+
+#undef TARGET_HARD_REGNO_MODE_OK
+#define TARGET_HARD_REGNO_MODE_OK pru_hard_regno_mode_ok
+
+#undef  TARGET_HARD_REGNO_SCRATCH_OK
+#define TARGET_HARD_REGNO_SCRATCH_OK pru_hard_regno_scratch_ok
+
+#undef TARGET_FUNCTION_ARG
+#define TARGET_FUNCTION_ARG pru_function_arg
+
+#undef TARGET_FUNCTION_ARG_ADVANCE
+#define TARGET_FUNCTION_ARG_ADVANCE pru_function_arg_advance
+
+#undef TARGET_ARG_PARTIAL_BYTES
+#define TARGET_ARG_PARTIAL_BYTES pru_arg_partial_bytes
+
+#undef TARGET_FUNCTION_VALUE
+#define TARGET_FUNCTION_VALUE pru_function_value
+
+#undef TARGET_LIBCALL_VALUE
+#define TARGET_LIBCALL_VALUE pru_libcall_value
+
+#undef TARGET_FUNCTION_VALUE_REGNO_P
+#define TARGET_FUNCTION_VALUE_REGNO_P pru_function_value_regno_p
+
+#undef TARGET_RETURN_IN_MEMORY
+#define TARGET_RETURN_IN_MEMORY pru_return_in_memory
+
+#undef TARGET_MUST_PASS_IN_STACK
+#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
+
+#undef TARGET_LEGITIMATE_ADDRESS_P
+#define TARGET_LEGITIMATE_ADDRESS_P pru_legitimate_address_p
+
+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS pru_preferred_reload_class
+
+#undef TARGET_INIT_LIBFUNCS
+#define TARGET_INIT_LIBFUNCS pru_init_libfuncs
+#undef TARGET_LIBFUNC_GNU_PREFIX
+#define TARGET_LIBFUNC_GNU_PREFIX true
+
+#undef TARGET_RTX_COSTS
+#define TARGET_RTX_COSTS pru_rtx_costs
+
+#undef TARGET_PRINT_OPERAND
+#define TARGET_PRINT_OPERAND pru_print_operand
+
+#undef TARGET_PRINT_OPERAND_ADDRESS
+#define TARGET_PRINT_OPERAND_ADDRESS pru_print_operand_address
+
+#undef TARGET_OPTION_OVERRIDE
+#define TARGET_OPTION_OVERRIDE pru_option_override
+
+#undef TARGET_SET_CURRENT_FUNCTION
+#define TARGET_SET_CURRENT_FUNCTION pru_set_current_function
+
+#undef  TARGET_MACHINE_DEPENDENT_REORG
+#define TARGET_MACHINE_DEPENDENT_REORG  pru_reorg
+
+#undef  TARGET_CAN_USE_DOLOOP_P
+#define TARGET_CAN_USE_DOLOOP_P		pru_can_use_doloop_p
+
+#undef TARGET_INVALID_WITHIN_DOLOOP
+#define TARGET_INVALID_WITHIN_DOLOOP  pru_invalid_within_doloop
+
+/* There are no shared libraries envisioned for PRU.  */
+//#undef TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT
+//#define TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT hook_bool_void_true
+
+#undef  TARGET_UNWIND_WORD_MODE
+#define TARGET_UNWIND_WORD_MODE pru_unwind_word_mode
+
+struct gcc_target targetm = TARGET_INITIALIZER;
+
+#include "gt-pru.h"
diff --git a/gcc/config/pru/pru.h b/gcc/config/pru/pru.h
new file mode 100644
index 00000000000..de1b9100209
--- /dev/null
+++ b/gcc/config/pru/pru.h
@@ -0,0 +1,551 @@ 
+/* Definitions of target machine for TI PRU.
+   Copyright (C) 2014-2018 Free Software Foundation, Inc.
+   Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_PRU_H
+#define GCC_PRU_H
+
+#include "config/pru/pru-opts.h"
+
+/* Define built-in preprocessor macros.  */
+#define TARGET_CPU_CPP_BUILTINS()		    \
+  do						    \
+    {						    \
+      builtin_define_std ("__PRU__");		    \
+      builtin_define_std ("__pru__");		    \
+      builtin_define_std ("__PRU_V3__");	    \
+      builtin_define_std ("__LITTLE_ENDIAN__");	    \
+      builtin_define_std ("__little_endian__");	    \
+      /* Trampolines are disabled for now.  */	    \
+      builtin_define_std ("NO_TRAMPOLINES");	    \
+    }						    \
+  while (0)
+
+/* TI ABI implementation is not feature enough (e.g. function pointers are
+   not supported), so we cannot list it as a multilib variant.  To prevent
+   misuse from users, do not link any of the standard libraries.  */
+#define DRIVER_SELF_SPECS			      \
+  "%{mabi=ti:-nodefaultlibs} "			      \
+  "%{mmcu=*:-specs=device-specs/%*%s %<mmcu=*} "
+
+#undef CPP_SPEC
+#define CPP_SPEC					\
+  "%(cpp_device) "					\
+  "%{mabi=ti:-D__PRU_EABI_TI__; :-D__PRU_EABI_GNU__}"
+
+/* Do not relax when in TI ABI mode since TI tools do not always
+   put PRU_S10_PCREL.  */
+#undef  LINK_SPEC
+#define LINK_SPEC					    \
+  "%(link_device) "					    \
+  "%{mabi=ti:--no-relax;:%{mno-relax:--no-relax;:--relax}} "   \
+  "%{shared:%eshared is not supported} "
+
+/* CRT0 is carefully maintained to be compatible with both GNU and TI ABIs.  */
+#undef  STARTFILE_SPEC
+#define STARTFILE_SPEC							\
+  "%{!pg:%{minrt:crt0-minrt.o%s}%{!minrt:crt0.o%s}} %{!mabi=ti:-lgcc} "
+
+#undef  ENDFILE_SPEC
+#define ENDFILE_SPEC "%{!mabi=ti:-lgloss} "
+
+/* TI ABI mandates that ELF symbols do not start with any prefix.  */
+#undef USER_LABEL_PREFIX
+#define USER_LABEL_PREFIX ""
+
+#undef LOCAL_LABEL_PREFIX
+#define LOCAL_LABEL_PREFIX ".L"
+
+/* Storage layout.  */
+
+#define DEFAULT_SIGNED_CHAR 0
+#define BITS_BIG_ENDIAN 0
+#define BYTES_BIG_ENDIAN 0
+#define WORDS_BIG_ENDIAN 0
+
+/* PRU is represented in GCC as an 8-bit CPU with fast 16b and 32bb
+   arithmetic.  */
+#define BITS_PER_WORD 8
+
+#ifdef IN_LIBGCC2
+/* This is to get correct SI and DI modes in libgcc2.c (32 and 64 bits).  */
+#define UNITS_PER_WORD 4
+#else
+/* Width of a word, in units (bytes).  */
+#define UNITS_PER_WORD 1
+#endif
+
+#define POINTER_SIZE 32
+#define BIGGEST_ALIGNMENT 8
+#define STRICT_ALIGNMENT 0
+#define FUNCTION_BOUNDARY 8	/* Func pointers are word-addressed.  */
+#define PARM_BOUNDARY 8
+#define STACK_BOUNDARY 8
+#define MAX_FIXED_MODE_SIZE 64
+
+#define POINTERS_EXTEND_UNSIGNED 1
+
+/* Layout of source language data types.  */
+
+#define INT_TYPE_SIZE 32
+#define SHORT_TYPE_SIZE 16
+#define LONG_TYPE_SIZE 32
+#define LONG_LONG_TYPE_SIZE 64
+#define FLOAT_TYPE_SIZE 32
+#define DOUBLE_TYPE_SIZE 64
+#define LONG_DOUBLE_TYPE_SIZE DOUBLE_TYPE_SIZE
+
+#undef SIZE_TYPE
+#define SIZE_TYPE "unsigned int"
+
+#undef PTRDIFF_TYPE
+#define PTRDIFF_TYPE "int"
+
+
+/* Basic characteristics of PRU registers:
+
+   Regno  Name
+   0      r0		  Caller Saved.  Also used as a static chain register.
+   1      r1		  Caller Saved.  Also used as a temporary by function
+			  profiler and function prologue/epilogue.
+   2      r2       sp	  Stack Pointer
+   3*     r3.w0    ra	  Return Address (16-bit)
+   4      r4       fp	  Frame Pointer
+   5-13   r5-r13	  Callee Saved Registers
+   14-29  r14-r29	  Register Arguments.  Caller Saved Registers.
+   14-15  r14-r15	  Return Location
+   30     r30		  Special I/O register.  Not used by compiler.
+   31     r31		  Special I/O register.  Not used by compiler.
+
+   32     loop_cntr	  Internal register used as a counter by LOOP insns
+
+   33     pc		  Not an actual register
+
+   34     fake_fp	  Fake Frame Pointer (always eliminated)
+   35     fake_ap	  Fake Argument Pointer (always eliminated)
+   36			  First Pseudo Register
+
+   The definitions for all the hard register numbers are located in pru.md.
+*/
+
+#define FIXED_REGISTERS				\
+  {						\
+/*   0 */  0,0,0,0, 0,0,0,0, 1,1,1,1, 1,1,1,1,	\
+/*   4 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*   8 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  12 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  16 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  20 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  24 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  28 */  0,0,0,0, 0,0,0,0, 1,1,1,1, 1,1,1,1,	\
+/*  32 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1	\
+  }
+
+/* Call used == caller saved + fixed regs + args + ret vals.  */
+#define CALL_USED_REGISTERS			\
+  {						\
+/*   0 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1,	\
+/*   4 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*   8 */  0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	\
+/*  12 */  0,0,0,0, 0,0,0,0, 1,1,1,1, 1,1,1,1,	\
+/*  16 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1,	\
+/*  20 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1,	\
+/*  24 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1,	\
+/*  28 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1,	\
+/*  32 */  1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1	\
+  }
+
+#define __pru_RSEQ(X)  (X) * 4 + 0, (X) * 4 + 1, (X) * 4 + 2, (X) * 4 + 3
+#define REG_ALLOC_ORDER							    \
+  {									    \
+    /* Call-clobbered, yet not used for parameters.  */			    \
+    __pru_RSEQ (0),  __pru_RSEQ ( 1),					    \
+									    \
+    __pru_RSEQ (14), __pru_RSEQ (15), __pru_RSEQ (16), __pru_RSEQ (17),	    \
+    __pru_RSEQ (18), __pru_RSEQ (19), __pru_RSEQ (20), __pru_RSEQ (21),	    \
+    __pru_RSEQ (22), __pru_RSEQ (23), __pru_RSEQ (24), __pru_RSEQ (25),	    \
+    __pru_RSEQ (26), __pru_RSEQ (27), __pru_RSEQ (28), __pru_RSEQ (29),	    \
+									    \
+    __pru_RSEQ ( 5), __pru_RSEQ ( 6), __pru_RSEQ ( 7), __pru_RSEQ ( 8),	    \
+    __pru_RSEQ ( 9), __pru_RSEQ (10), __pru_RSEQ (11), __pru_RSEQ (12),	    \
+    __pru_RSEQ (13),							    \
+									    \
+    __pru_RSEQ ( 4),							    \
+    __pru_RSEQ ( 2), __pru_RSEQ ( 3),					    \
+									    \
+    /* I/O and virtual registers.  */					    \
+    __pru_RSEQ (30), __pru_RSEQ (31), __pru_RSEQ (32), __pru_RSEQ (33),	    \
+    __pru_RSEQ (34), __pru_RSEQ (35)					    \
+  }
+
+/* Register Classes.  */
+
+enum reg_class
+{
+  NO_REGS,
+  SIB_REGS,
+  LOOPCNTR_REGS,
+  GP_REGS,
+  ALL_REGS,
+  LIM_REG_CLASSES
+};
+
+#define N_REG_CLASSES (int) LIM_REG_CLASSES
+
+#define REG_CLASS_NAMES   \
+  {  "NO_REGS",		  \
+     "SIB_REGS",	  \
+     "LOOPCNTR_REGS",	  \
+     "GP_REGS",		  \
+     "ALL_REGS" }
+
+#define GENERAL_REGS ALL_REGS
+
+#define REG_CLASS_CONTENTS					\
+  {								\
+    /* NO_REGS	      */ { 0, 0, 0, 0, 0},			\
+    /* SIB_REGS	      */ { 0xf, 0xff000000, ~0, 0xffffff, 0},	\
+    /* LOOPCNTR_REGS  */ { 0, 0, 0, 0, 0xf},			\
+    /* GP_REGS	      */ { ~0, ~0, ~0, ~0, 0},			\
+    /* ALL_REGS	      */ { ~0,~0, ~0, ~0, ~0}			\
+  }
+
+
+#define GP_REG_P(REGNO) ((unsigned)(REGNO) <= LAST_GP_REG)
+#define REGNO_REG_CLASS(REGNO)						    \
+	((REGNO) >= FIRST_ARG_REGNO && (REGNO) <= LAST_ARG_REGNO ? SIB_REGS \
+	 : (REGNO) == STATIC_CHAIN_REGNUM ? SIB_REGS			    \
+	 : (REGNO) == LOOPCNTR_REG ? LOOPCNTR_REGS			    \
+	 : (REGNO) <= LAST_NONIO_GP_REG ? GP_REGS			    \
+	 : ALL_REGS)
+
+#define CLASS_MAX_NREGS(CLASS, MODE) \
+  ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
+
+/* Arbitrarily set to a non-argument register.  Not defined by TI ABI.  */
+#define STATIC_CHAIN_REGNUM      0	/* r0 */
+
+/* Tests for various kinds of constants used in the PRU port.  */
+#define SHIFT_INT(X) ((X) >= 0 && (X) <= 31)
+
+#define UHWORD_INT(X) (IN_RANGE ((X), 0, 0xffff))
+#define SHWORD_INT(X) (IN_RANGE ((X), -32768, 32767))
+#define UBYTE_INT(X) (IN_RANGE ((X), 0, 0xff))
+
+/* Say that the epilogue uses the return address register.  Note that
+   in the case of sibcalls, the values "used by the epilogue" are
+   considered live at the start of the called function.  */
+#define EPILOGUE_USES(REGNO) (epilogue_completed &&	      \
+			      (((REGNO) == RA_REGNO)	      \
+			       || (REGNO) == (RA_REGNO + 1)))
+
+/* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
+   the stack pointer does not matter.  The value is tested only in
+   functions that have frame pointers.
+   No definition is equivalent to always zero.  */
+
+#define EXIT_IGNORE_STACK 1
+
+/* Trampolines are not supported, but put a define to keep the build.  */
+#define TRAMPOLINE_SIZE 4
+
+/* Stack layout.  */
+#define STACK_GROWS_DOWNWARD  1
+#undef FRAME_GROWS_DOWNWARD
+#define FIRST_PARM_OFFSET(FUNDECL) 0
+
+/* Before the prologue, RA lives in r3.w2.  */
+#define INCOMING_RETURN_ADDR_RTX	gen_rtx_REG (HImode, RA_REGNO)
+
+#define RETURN_ADDR_RTX(C,F) pru_get_return_address (C)
+
+#define DWARF_FRAME_RETURN_COLUMN RA_REGNO
+
+/* The CFA includes the pretend args.  */
+#define ARG_POINTER_CFA_OFFSET(FNDECL) \
+  (gcc_assert ((FNDECL) == current_function_decl), \
+   FIRST_PARM_OFFSET (FNDECL) + crtl->args.pretend_args_size)
+
+/* Frame/arg pointer elimination settings.  */
+#define ELIMINABLE_REGS							\
+{{ ARG_POINTER_REGNUM,   STACK_POINTER_REGNUM},				\
+ { ARG_POINTER_REGNUM,   HARD_FRAME_POINTER_REGNUM},			\
+ { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM},				\
+ { FRAME_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM}}
+
+#define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET) \
+  (OFFSET) = pru_initial_elimination_offset ((FROM), (TO))
+
+#define HARD_REGNO_RENAME_OK(OLD_REG, NEW_REG) \
+  pru_hard_regno_rename_ok (OLD_REG, NEW_REG)
+
+/* Calling convention definitions.  */
+#if !defined(IN_LIBGCC2)
+
+#define NUM_ARG_REGS (LAST_ARG_REGNO - FIRST_ARG_REGNO + 1)
+
+typedef struct pru_args
+{
+  bool regs_used[NUM_ARG_REGS];
+} CUMULATIVE_ARGS;
+
+#define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, FNDECL, N_NAMED_ARGS)  \
+  do {									  \
+      memset ((CUM).regs_used, 0, sizeof ((CUM).regs_used));		  \
+  } while (0)
+
+#define FUNCTION_ARG_REGNO_P(REGNO) \
+  ((REGNO) >= FIRST_ARG_REGNO && (REGNO) <= LAST_ARG_REGNO)
+
+/* Passing function arguments on stack.  */
+#define PUSH_ARGS 0
+#define ACCUMULATE_OUTGOING_ARGS 1
+
+/* We define TARGET_RETURN_IN_MEMORY, so set to zero.  */
+#define DEFAULT_PCC_STRUCT_RETURN 0
+
+/* Profiling.  */
+#define PROFILE_BEFORE_PROLOGUE
+#define NO_PROFILE_COUNTERS 1
+#define FUNCTION_PROFILER(FILE, LABELNO) \
+  pru_function_profiler ((FILE), (LABELNO))
+
+#endif	/* IN_LIBGCC2 */
+
+/* Addressing modes.  */
+
+#define CONSTANT_ADDRESS_P(X) \
+  (CONSTANT_P (X) && memory_address_p (SImode, X))
+
+#define MAX_REGS_PER_ADDRESS 2
+#define BASE_REG_CLASS ALL_REGS
+#define INDEX_REG_CLASS ALL_REGS
+
+#define REGNO_OK_FOR_BASE_P(REGNO) pru_regno_ok_for_base_p ((REGNO), true)
+#define REGNO_OK_FOR_INDEX_P(REGNO) pru_regno_ok_for_index_p ((REGNO), true)
+
+/* Limited by the insns in pru-ldst-multiple.md.  */
+#define MOVE_MAX 8
+#define SLOW_BYTE_ACCESS 1
+
+/* It is as good to call a constant function address as to call an address
+   kept in a register.  */
+#define NO_FUNCTION_CSE 1
+
+/* Define output assembler language.  */
+
+#define ASM_APP_ON "#APP\n"
+#define ASM_APP_OFF "#NO_APP\n"
+
+#define ASM_COMMENT_START "# "
+
+#define GLOBAL_ASM_OP "\t.global\t"
+
+#define __pru_name_R(X)  X".b0", X".b1", X".b2", X".b3"
+#define REGISTER_NAMES		  \
+  {				  \
+    __pru_name_R ("r0"),	  \
+    __pru_name_R ("r1"),	  \
+    __pru_name_R ("r2"),	  \
+    __pru_name_R ("r3"),	  \
+    __pru_name_R ("r4"),	  \
+    __pru_name_R ("r5"),	  \
+    __pru_name_R ("r6"),	  \
+    __pru_name_R ("r7"),	  \
+    __pru_name_R ("r8"),	  \
+    __pru_name_R ("r9"),	  \
+    __pru_name_R ("r10"),	  \
+    __pru_name_R ("r11"),	  \
+    __pru_name_R ("r12"),	  \
+    __pru_name_R ("r13"),	  \
+    __pru_name_R ("r14"),	  \
+    __pru_name_R ("r15"),	  \
+    __pru_name_R ("r16"),	  \
+    __pru_name_R ("r17"),	  \
+    __pru_name_R ("r18"),	  \
+    __pru_name_R ("r19"),	  \
+    __pru_name_R ("r20"),	  \
+    __pru_name_R ("r21"),	  \
+    __pru_name_R ("r22"),	  \
+    __pru_name_R ("r23"),	  \
+    __pru_name_R ("r24"),	  \
+    __pru_name_R ("r25"),	  \
+    __pru_name_R ("r26"),	  \
+    __pru_name_R ("r27"),	  \
+    __pru_name_R ("r28"),	  \
+    __pru_name_R ("r29"),	  \
+    __pru_name_R ("r30"),	  \
+    __pru_name_R ("r31"),	  \
+    __pru_name_R ("loopcntr_reg"), \
+    __pru_name_R ("pc"),	  \
+    __pru_name_R ("fake_fp"),	  \
+    __pru_name_R ("fake_ap"),	  \
+}
+
+#define __pru_overlap_R(X)	      \
+  { "r" #X	, X * 4	    ,  4 },   \
+  { "r" #X ".w0", X * 4 + 0 ,  2 },   \
+  { "r" #X ".w1", X * 4 + 1 ,  2 },   \
+  { "r" #X ".w2", X * 4 + 2 ,  2 }
+
+#define OVERLAPPING_REGISTER_NAMES  \
+  {				    \
+    /* Aliases.  */		    \
+    { "sp", 2 * 4, 4 },		    \
+    { "ra", 3 * 4, 2 },		    \
+    { "fp", 4 * 4, 4 },		    \
+    __pru_overlap_R (0),	    \
+    __pru_overlap_R (1),	    \
+    __pru_overlap_R (2),	    \
+    __pru_overlap_R (3),	    \
+    __pru_overlap_R (4),	    \
+    __pru_overlap_R (5),	    \
+    __pru_overlap_R (6),	    \
+    __pru_overlap_R (7),	    \
+    __pru_overlap_R (8),	    \
+    __pru_overlap_R (9),	    \
+    __pru_overlap_R (10),	    \
+    __pru_overlap_R (11),	    \
+    __pru_overlap_R (12),	    \
+    __pru_overlap_R (13),	    \
+    __pru_overlap_R (14),	    \
+    __pru_overlap_R (15),	    \
+    __pru_overlap_R (16),	    \
+    __pru_overlap_R (17),	    \
+    __pru_overlap_R (18),	    \
+    __pru_overlap_R (19),	    \
+    __pru_overlap_R (20),	    \
+    __pru_overlap_R (21),	    \
+    __pru_overlap_R (22),	    \
+    __pru_overlap_R (23),	    \
+    __pru_overlap_R (24),	    \
+    __pru_overlap_R (25),	    \
+    __pru_overlap_R (26),	    \
+    __pru_overlap_R (27),	    \
+    __pru_overlap_R (28),	    \
+    __pru_overlap_R (29),	    \
+    __pru_overlap_R (30),	    \
+    __pru_overlap_R (31),	    \
+}
+
+#define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE)				    \
+  do									    \
+    {									    \
+      fputs (integer_asm_op (POINTER_SIZE / BITS_PER_UNIT, TRUE), FILE);    \
+      fprintf (FILE, "%%pmem(.L%u)\n", (unsigned) (VALUE));		    \
+    }									    \
+  while (0)
+
+#define ASM_OUTPUT_ADDR_DIFF_ELT(STREAM, BODY, VALUE, REL)		    \
+  do									    \
+    {									    \
+      fputs (integer_asm_op (POINTER_SIZE / BITS_PER_UNIT, TRUE), STREAM);  \
+      fprintf (STREAM, "%%pmem(.L%u-.L%u)\n", (unsigned) (VALUE),	    \
+	       (unsigned) (REL));					    \
+    }									    \
+  while (0)
+
+/* Section directives.  */
+
+/* Output before read-only data.  */
+#define TEXT_SECTION_ASM_OP "\t.section\t.text"
+
+/* Output before writable data.  */
+#define DATA_SECTION_ASM_OP "\t.section\t.data"
+
+/* Output before uninitialized data.  */
+#define BSS_SECTION_ASM_OP "\t.section\t.bss"
+
+#define CTORS_SECTION_ASM_OP "\t.section\t.init_array,\"aw\",%init_array"
+#define DTORS_SECTION_ASM_OP "\t.section\t.fini_array,\"aw\",%fini_array"
+
+#undef INIT_SECTION_ASM_OP
+#undef FINI_SECTION_ASM_OP
+#define INIT_ARRAY_SECTION_ASM_OP CTORS_SECTION_ASM_OP
+#define FINI_ARRAY_SECTION_ASM_OP DTORS_SECTION_ASM_OP
+
+/* Since we use .init_array/.fini_array we don't need the markers at
+   the start and end of the ctors/dtors arrays.  */
+#define CTOR_LIST_BEGIN asm (CTORS_SECTION_ASM_OP)
+#define CTOR_LIST_END		/* empty */
+#define DTOR_LIST_BEGIN asm (DTORS_SECTION_ASM_OP)
+#define DTOR_LIST_END		/* empty */
+
+#undef TARGET_ASM_CONSTRUCTOR
+#define TARGET_ASM_CONSTRUCTOR pru_elf_asm_constructor
+
+#undef TARGET_ASM_DESTRUCTOR
+#define TARGET_ASM_DESTRUCTOR pru_elf_asm_destructor
+
+#define ASM_OUTPUT_ALIGN(FILE, LOG)		      \
+  do {						      \
+    fprintf ((FILE), "%s%d\n", ALIGN_ASM_OP, (LOG));  \
+  } while (0)
+
+#undef  ASM_OUTPUT_ALIGNED_COMMON
+#define ASM_OUTPUT_ALIGNED_COMMON(FILE, NAME, SIZE, ALIGN)		\
+do									\
+  {									\
+    fprintf ((FILE), "%s", COMMON_ASM_OP);				\
+    assemble_name ((FILE), (NAME));					\
+    fprintf ((FILE), "," HOST_WIDE_INT_PRINT_UNSIGNED ",%u\n", (SIZE),	\
+	     (ALIGN) / BITS_PER_UNIT);					\
+  }									\
+while (0)
+
+
+/* This says how to output assembler code to declare an
+   uninitialized internal linkage data object.  Under SVR4,
+   the linker seems to want the alignment of data objects
+   to depend on their types.  We do exactly that here.  */
+
+#undef  ASM_OUTPUT_ALIGNED_LOCAL
+#define ASM_OUTPUT_ALIGNED_LOCAL(FILE, NAME, SIZE, ALIGN)		\
+do {									\
+  switch_to_section (bss_section);					\
+  ASM_OUTPUT_TYPE_DIRECTIVE (FILE, NAME, "object");			\
+  if (!flag_inhibit_size_directive)					\
+    ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, SIZE);			\
+  ASM_OUTPUT_ALIGN ((FILE), exact_log2 ((ALIGN) / BITS_PER_UNIT));      \
+  ASM_OUTPUT_LABEL (FILE, NAME);					\
+  ASM_OUTPUT_SKIP ((FILE), (SIZE) ? (SIZE) : 1);			\
+} while (0)
+
+/* Misc parameters.  */
+
+#define STORE_FLAG_VALUE 1
+#define Pmode SImode
+#define FUNCTION_MODE Pmode
+
+#define CASE_VECTOR_MODE Pmode
+
+/* Jumps are cheap on PRU.  */
+#define LOGICAL_OP_NON_SHORT_CIRCUIT		0
+
+/* Unfortunately the LBBO instruction does not zero-extend data.  */
+#undef LOAD_EXTEND_OP
+
+#undef WORD_REGISTER_OPERATIONS
+
+#define HAS_LONG_UNCOND_BRANCH			1
+#define HAS_LONG_COND_BRANCH			1
+
+#define REGISTER_TARGET_PRAGMAS() pru_register_pragmas ()
+
+#endif /* GCC_PRU_H */
diff --git a/gcc/config/pru/pru.md b/gcc/config/pru/pru.md
new file mode 100644
index 00000000000..328fb484847
--- /dev/null
+++ b/gcc/config/pru/pru.md
@@ -0,0 +1,905 @@ 
+;; Machine Description for TI PRU.
+;; Copyright (C) 2014-2018 Free Software Foundation, Inc.
+;; Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+;; Based on the NIOS2 GCC port.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Register numbers.
+(define_constants
+  [
+   (FIRST_ARG_REGNO	    56)	; Argument registers.
+   (LAST_ARG_REGNO	    119)	;
+   (FIRST_RETVAL_REGNO	    56)	; Return value registers.
+   (LAST_RETVAL_REGNO	    60)	;
+   (PROLOGUE_TEMP_REGNO	    4)	; Temporary register to use in prologue.
+
+   (RA_REGNO		    14)	; Return address register r3.w2.
+   (FP_REGNO		    16)	; Frame pointer register.
+   (LAST_NONIO_GP_REG	    119)	; Last non-I/O general purpose register.
+   (LOOPCNTR_REG	    128)	; internal LOOP counter register
+   (LAST_GP_REG		    132)	; Last general purpose register.
+
+   ;; Target register definitions.
+   (STACK_POINTER_REGNUM	8)
+   (HARD_FRAME_POINTER_REGNUM	FP_REGNO)
+   (PC_REGNUM			132)
+   (FRAME_POINTER_REGNUM	136)
+   (ARG_POINTER_REGNUM		140)
+   (FIRST_PSEUDO_REGISTER	144)
+
+   ;; Misc
+   (MAX_XBBO_BURST_LEN	    29)	; Artificially limited by GCC - see how
+				; genextract.c uses [a-zA-Z] to record
+				; "path to a vector".  Also see the check
+				; for MAX_RECOG_OPERANDS in regrename.c
+  ]
+)
+
+;; Enumeration of UNSPECs.
+
+(define_c_enum "unspecv" [
+  UNSPECV_DELAY_CYCLES_START
+  UNSPECV_DELAY_CYCLES_END
+  UNSPECV_DELAY_CYCLES_2X_HI
+  UNSPECV_DELAY_CYCLES_2X_SI
+  UNSPECV_DELAY_CYCLES_1
+
+  UNSPECV_LOOP_BEGIN
+  UNSPECV_LOOP_END
+
+  UNSPECV_BLOCKAGE
+])
+
+; Length of an instruction (in bytes).
+(define_attr "length" "" (const_int 4))
+(define_attr "type"
+  "unknown,complex,control,alu,cond_alu,st,ld,shift"
+  (const_string "complex"))
+
+(define_asm_attributes
+ [(set_attr "length" "4")
+  (set_attr "type" "complex")])
+
+; There is no pipeline, so our scheduling description is simple.
+(define_automaton "pru")
+(define_cpu_unit "cpu" "pru")
+
+(define_insn_reservation "everything" 1 (match_test "true") "cpu")
+
+(include "predicates.md")
+(include "constraints.md")
+
+;; All supported direct move-modes
+(define_mode_iterator MOVMODE [QI QQ UQQ
+			       HI HQ UHQ HA UHA
+			       SI SQ USQ SA USA
+			       SF])
+
+(define_mode_iterator MOV32 [SI SF SD SQ USQ])
+(define_mode_iterator MOV64 [DI DF DD DQ UDQ])
+(define_mode_iterator QISI [QI HI SI])
+(define_mode_iterator HISI [HI SI])
+(define_mode_iterator QIHI [QI HI])
+(define_mode_iterator SFDF [SF DF])
+
+;; EQS0/0 for extension source 0/1 and EQD for extension destination patterns.
+(define_mode_iterator EQS0 [QI HI SI])
+(define_mode_iterator EQS1 [QI HI SI])
+(define_mode_iterator EQD [QI HI SI])
+
+; Not recommended.  Please use %0 instead!
+(define_mode_attr regwidth [(QI ".b0") (HI ".w0") (SI "")])
+
+;; Move instructions
+
+(define_expand "mov<mode>"
+  [(set (match_operand:MOVMODE 0 "nonimmediate_operand" "")
+	(match_operand:MOVMODE 1 "general_operand"       ""))]
+  ""
+  "
+  {
+    /* It helps to split constant loading and memory access
+       early, so that the LDI/LDI32 instructions can be hoisted
+       outside a loop body.  */
+    if (MEM_P (operands[0]))
+      operands[1] = force_reg (<MODE>mode, operands[1]);
+  }
+  "
+)
+
+;; Leave mem and reg operands in the same insn.  Otherwise LRA gets
+;; confused, and gcc.target/pru/pr64366.c triggers infinite loops in reload.
+(define_insn "prumov<mode>"
+  [(set (match_operand:MOVMODE 0 "nonimmediate_operand" "=m,r,r")
+	(match_operand:MOVMODE 1 "nonimmediate_operand" "r,m,r"))]
+  ""
+  "@
+    sb%B0o\\t%b1, %0, %S0
+    lb%B1o\\t%b0, %1, %S1
+    mov\\t%0, %1"
+  [(set_attr "type" "st,ld,alu")
+   (set_attr "length" "4,4,4")])
+
+
+;; Split loading of integer constants into a distinct pattern, in order to
+;; prevent CSE from considering "SET (MEM, CONST_INT)" as a valid insn
+;; selection.  This fixes an abnormally long compile time exposed by
+;; gcc.dg/pr48141.c
+;;
+;; Note: Assume that Program Mem (T constraint) can fit in 16 bits!
+(define_insn "prumov_ldi<mode>"
+  [(set (match_operand:QIHI 0 "register_operand"	"=r,r,r")
+	(match_operand:VOID 1 "immediate_operand"       "T,J,N"))]
+  ""
+  "@
+    ldi\\t%0, %%pmem(%1)
+    ldi\\t%0, %1
+    ldi\\t%0, (%1) & 0xffff"
+  [(set_attr "type" "alu,alu,alu")
+   (set_attr "length" "4,4,4")])
+
+(define_insn "prumov_ldisisf<mode>"
+  [(set (match_operand:MOV32 0 "register_operand"	 "=r,r,r")
+	(match_operand:MOV32 1 "immediate_operand"       "T,J,iF"))]
+  ""
+  "@
+    ldi\\t%F0, %%pmem(%1)
+    ldi\\t%F0, %1
+    ldi32\\t%F0, %1"
+  [(set_attr "type" "alu,alu,alu")
+   (set_attr "length" "4,4,8")])
+
+; I cannot think of any reason for the core to pass a 64-bit symbolic
+; constants.  Hence simplify the rule and handle only numeric constants.
+;
+; Note: Unlike the arithmetics, here we cannot use "&" output modifier.
+; GCC expects to be able to move registers around "no matter what".
+; Forcing DI reg alignment (akin to microblaze's HARD_REGNO_MODE_OK)
+; does not seem efficient, and will violate TI ABI.
+(define_insn "mov<mode>"
+  [(set (match_operand:MOV64 0 "nonimmediate_operand" "=m,r,r,r,r,r")
+	(match_operand:MOV64 1 "general_operand"       "r,m,r,T,J,nF"))]
+  ""
+  {
+    switch (which_alternative)
+    {
+      case 0:
+	return "sb%B0o\\t%b1, %0, %S0";
+      case 1:
+	return "lb%B1o\\t%b0, %1, %S1";
+      case 2:
+	/* careful with overlapping source and destination regs.  */
+	gcc_assert (GP_REG_P (REGNO (operands[0])));
+	gcc_assert (GP_REG_P (REGNO (operands[1])));
+	if (REGNO (operands[0]) == (REGNO (operands[1]) + 4))
+	  return "mov\\t%N0, %N1\;mov\\t%F0, %F1";
+	else
+	  return "mov\\t%F0, %F1\;mov\\t%N0, %N1";
+      case 3:
+	return "ldi\\t%F0, %%pmem(%1)\;ldi\\t%N0, 0";
+      case 4:
+	return "ldi\\t%F0, %1\;ldi\\t%N0, 0";
+      case 5:
+	return "ldi32\\t%F0, %w1\;"
+	       "ldi32\\t%N0, %W1";
+      default:
+	gcc_unreachable ();
+    }
+  }
+  [(set_attr "type" "st,ld,alu,alu,alu,alu")
+   (set_attr "length" "4,4,8,8,8,16")])
+
+;; Load multiple
+;;   op0: first of the consecutive registers
+;;   op1: first memory location
+;;   op2: number of consecutive registers
+(define_expand "load_multiple"
+  [(match_par_dup 3 [(set (match_operand:QI 0 "" "")
+			  (match_operand:QI 1 "" ""))
+		     (use (match_operand:SI 2 "" ""))])]
+  ""
+  "
+{
+  int first_regno, count, i;
+
+  /* Support only loading a constant number of registers from memory and
+     only if at least one register.  */
+  if (GET_CODE (operands[2]) != CONST_INT
+      || INTVAL (operands[2]) < 1
+      || INTVAL (operands[2]) > MAX_XBBO_BURST_LEN
+      || GET_CODE (operands[1]) != MEM
+      || GET_CODE (operands[0]) != REG
+      || (REGNO (operands[0]) + INTVAL (operands[2]) - 1) > LAST_NONIO_GP_REG)
+    FAIL;
+
+  count = INTVAL (operands[2]);
+  first_regno = REGNO (operands[0]);
+
+  operands[3] = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (count));
+
+  for (i = 0; i < count; i++)
+    XVECEXP (operands[3], 0, i)
+      = gen_rtx_SET (gen_rtx_REG (QImode, first_regno + i),
+		     gen_rtx_MEM (QImode,
+				  plus_constant (Pmode,
+						 XEXP (operands[1], 0),
+						 i * UNITS_PER_WORD)));
+}")
+
+;; Store multiple
+;;   op0: first memory location
+;;   op1: first of the consecutive registers
+;;   op2: number of consecutive registers
+(define_expand "store_multiple"
+  [(match_par_dup 3 [(set (match_operand:QI 0 "" "")
+			  (match_operand:QI 1 "" ""))
+		     (use (match_operand:SI 2 "" ""))])]
+  ""
+  "
+{
+  int first_regno, count, i;
+
+  /* Support only storing a constant number of registers to memory and
+     only if at least one register.  */
+  if (GET_CODE (operands[2]) != CONST_INT
+      || INTVAL (operands[2]) < 1
+      || INTVAL (operands[2]) > MAX_XBBO_BURST_LEN
+      || GET_CODE (operands[0]) != MEM
+      || GET_CODE (operands[1]) != REG
+      || (REGNO (operands[1]) + INTVAL (operands[2]) - 1) > LAST_NONIO_GP_REG)
+    FAIL;
+
+  count = INTVAL (operands[2]);
+  first_regno = REGNO (operands[1]);
+
+  operands[3] = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (count));
+
+  for (i = 0; i < count; i++)
+    XVECEXP (operands[3], 0, i)
+      = gen_rtx_SET (gen_rtx_MEM (QImode,
+				  plus_constant (Pmode,
+						 XEXP (operands[0], 0),
+						 i * UNITS_PER_WORD)),
+		     gen_rtx_REG (QImode, first_regno + i));
+}")
+
+;; Include the machine-generated patterns for all LBBO/SBBO multuple-reg
+;; lengths.  This is needed due to limitations of GCC.  See
+;; http://article.gmane.org/gmane.comp.gcc.devel/54458
+(include "pru-ldst-multiple.md")
+
+;; Zero extension patterns
+;;
+;; Unfortunately we cannot use lbbo to load AND zero-extent a value.
+;; The burst length parameter of the LBBO instruction designates not only
+;; the number of memory data bytes fetched, but also the number of register
+;; byte fields written.
+(define_expand "zero_extend<EQS0:mode><EQD:mode>2"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(zero_extend:EQD (match_operand:EQS0 1 "register_operand" "r")))]
+  ""
+  ""
+  [(set_attr "type"     "alu")])
+
+(define_insn "*zero_extend<EQS0:mode><EQD:mode>2"
+  [(set (match_operand:EQD 0 "register_operand" "=r")
+	(zero_extend:EQD (match_operand:EQS0 1 "register_operand" "r")))]
+  ""
+  "mov\\t%0, %1"
+  [(set_attr "type"     "alu")])
+
+;; Sign extension patterns.  We have to emulate them due to lack of
+;; signed operations in PRU's ALU.
+
+(define_insn "extend<EQS0:mode><EQD:mode>2"
+  [(set (match_operand:EQD 0 "register_operand"			  "=r")
+	(sign_extend:EQD (match_operand:EQS0 1 "register_operand"  "r")))]
+  ""
+  {
+    return pru_output_sign_extend (operands);
+  }
+  [(set_attr "type" "complex")
+   (set_attr "length" "12")])
+
+;; Bit extraction
+;; We define it solely to allow combine to choose SImode
+;; for word mode when trying to match our cbranch_qbbx_* insn.
+;;
+;; Check how combine.c:make_extraction() uses
+;; get_best_reg_extraction_insn() to select the op size.
+(define_insn "extzv<mode>"
+  [(set (match_operand:QISI 0 "register_operand"	"=r")
+	  (zero_extract:QISI
+	   (match_operand:QISI 1 "register_operand"	"r")
+	   (match_operand:QISI 2 "const_int_operand"	"i")
+	   (match_operand:QISI 3 "const_int_operand"	"i")))]
+  ""
+  "lsl\\t%0, %1, (%S0 * 8 - %2 - %3)\;lsr\\t%0, %0, (%S0 * 8 - %2)"
+  [(set_attr "type" "complex")
+   (set_attr "length" "8")])
+
+
+
+;; Arithmetic Operations
+
+(define_expand "add<mode>3"
+  [(set (match_operand:QISI 0 "register_operand"		  "=r,r,r")
+	(plus:QISI (match_operand:QISI 1 "register_operand"   "%r,r,r")
+		 (match_operand:QISI 2 "nonmemory_operand"  "r,I,M")))]
+  ""
+  ""
+  [(set_attr "type" "alu")])
+
+(define_insn "adddi3"
+  [(set (match_operand:DI 0 "register_operand"		    "=&r,&r,&r")
+	(plus:DI (match_operand:DI 1 "register_operand"	    "%r,r,r")
+		 (match_operand:DI 2 "reg_or_ubyte_operand" "r,I,M")))]
+  ""
+  "@
+   add\\t%F0, %F1, %F2\;adc\\t%N0, %N1, %N2
+   add\\t%F0, %F1, %2\;adc\\t%N0, %N1, 0
+   sub\\t%F0, %F1, %n2\;suc\\t%N0, %N1, 0"
+  [(set_attr "type" "alu,alu,alu")
+   (set_attr "length" "8,8,8")])
+
+(define_expand "sub<mode>3"
+  [(set (match_operand:QISI 0 "register_operand"		      "=r,r,r")
+	(minus:QISI (match_operand:QISI 1 "reg_or_ubyte_operand"  "r,r,I")
+		  (match_operand:QISI 2 "reg_or_ubyte_operand"  "r,I,r")))]
+  ""
+  ""
+  [(set_attr "type" "alu")])
+
+(define_insn "subdi3"
+  [(set (match_operand:DI 0 "register_operand"		    "=&r,&r,&r")
+	(minus:DI (match_operand:DI 1 "register_operand"    "r,r,I")
+		 (match_operand:DI 2 "reg_or_ubyte_operand" "r,I,r")))]
+  ""
+  "@
+   sub\\t%F0, %F1, %F2\;suc\\t%N0, %N1, %N2
+   sub\\t%F0, %F1, %2\;suc\\t%N0, %N1, 0
+   rsb\\t%F0, %F2, %1\;rsc\\t%N0, %N2, 0"
+  [(set_attr "type" "alu,alu,alu")
+   (set_attr "length" "8,8,8")])
+
+;;  Negate and ones complement
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:QISI 0 "register_operand"		"=r")
+	(neg:QISI (match_operand:QISI 1 "register_operand"	"r")))]
+  ""
+  ""
+  [(set_attr "type" "alu")])
+
+(define_expand "one_cmpl<mode>2"
+  [(set (match_operand:QISI 0 "register_operand"		"=r")
+	(not:QISI (match_operand:QISI 1 "register_operand"	"r")))]
+  ""
+  ""
+  [(set_attr "type" "alu")])
+
+;;  Integer logical Operations
+;;
+;; TODO - add optimized cases that exploit the fact that we can get away
+;; with a single machine op for special constants, e.g. UBYTE << (0/8/16/24)
+
+(define_code_iterator LOGICAL [and ior xor umin umax])
+(define_code_attr logical_asm [(and "and") (ior "or") (xor "xor") (umin "min") (umax "max")])
+
+(define_code_iterator LOGICAL_BITOP [and ior xor])
+(define_code_attr logical_bitop_asm [(and "and") (ior "or") (xor "xor")])
+
+(define_expand "<code><mode>3"
+  [(set (match_operand:QISI 0 "register_operand"			"=r")
+	(LOGICAL:QISI (match_operand:QISI 1 "register_operand"	"%r")
+		    (match_operand:QISI 2 "reg_or_ubyte_operand"	"rI")))]
+  ""
+  ""
+  [(set_attr "type" "alu")])
+
+; Combine phase tends to produce this expression when given the following
+; expression: "uint dst = (ushort) op0 & (uint) op1;
+;
+; TODO - fix properly! Understand whether we need to fix combine core, augment
+; our hooks, or PRU is giving incorrect costs for subexpressions.
+(define_insn "*and_combine_zero_extend_workaround<mode>"
+ [(set (match_operand:SI 0 "register_operand"       "=r")
+       (zero_extend:SI
+	(subreg:EQD
+	  (and:SI
+	    (match_operand:SI 1 "register_operand"  "%r")
+	    (match_operand:SI 2 "register_operand"  "rI"))
+	  0)))]
+ ""
+ "and\\t%0, %1, %F2<EQD:regwidth>"
+ [(set_attr "type" "alu")])
+
+
+;;  Shift instructions
+
+(define_code_iterator SHIFT  [ashift lshiftrt])
+(define_code_attr shift_op   [(ashift "ashl") (lshiftrt "lshr")])
+(define_code_attr shift_asm  [(ashift "lsl") (lshiftrt "lsr")])
+
+(define_expand "<shift_op><mode>3"
+  [(set (match_operand:QISI 0 "register_operand"		  "=r")
+	(SHIFT:QISI (match_operand:QISI 1 "register_operand"  "r")
+		  (match_operand:QISI 2 "shift_operand"	  "rL")))]
+  ""
+  ""
+  [(set_attr "type" "shift")])
+
+; LRA cannot cope with clobbered op2, hence the scratch register.
+(define_insn "ashr<mode>3"
+  [(set (match_operand:QISI 0 "register_operand" "=&r,r")
+	  (ashiftrt:QISI
+	   (match_operand:QISI 1 "register_operand" "0,r")
+	   (match_operand:QISI 2 "reg_or_const_1_operand" "r,P")))
+    (clobber (match_scratch:QISI 3 "=r,X"))]
+  ""
+  "@
+   mov %3, %2\;ASHRLP%=:\;qbeq ASHREND%=, %3, 0\; sub %3, %3, 1\; lsr\\t%0, %0, 1\; qbbc ASHRLP%=, %0, (%S0 * 8) - 2\; set %0, %0, (%S0 * 8) - 1\; jmp ASHRLP%=\;ASHREND%=:
+   lsr\\t%0, %1, 1\;qbbc LSIGN%=, %0, (%S0 * 8) - 2\;set %0, %0, (%S0 * 8) - 1\;LSIGN%=:"
+  [(set_attr "type" "complex,alu")
+   (set_attr "length" "28,4")])
+
+
+;; Include ALU patterns with zero-extension of operands.  That's where
+;; the real insns are defined.
+
+(include "alu-zext.md")
+
+(define_insn "<code>di3"
+  [(set (match_operand:DI 0 "register_operand" "=&r,&r")
+	  (LOGICAL_BITOP:DI
+	   (match_operand:DI 1 "register_operand"     "%r,r")
+	   (match_operand:DI 2 "reg_or_ubyte_operand" "r,I")))]
+  ""
+  "@
+   <logical_bitop_asm>\\t%F0, %F1, %F2\;<logical_bitop_asm>\\t%N0, %N1, %N2
+   <logical_bitop_asm>\\t%F0, %F1, %2\;<logical_bitop_asm>\\t%N0, %N1, 0"
+  [(set_attr "type" "alu,alu")
+   (set_attr "length" "8,8")])
+
+
+(define_insn "one_cmpldi2"
+  [(set (match_operand:DI 0 "register_operand"		"=r")
+	(not:DI (match_operand:DI 1 "register_operand"	"r")))]
+  ""
+  {
+    /* careful with overlapping source and destination regs.  */
+    gcc_assert (GP_REG_P (REGNO (operands[0])));
+    gcc_assert (GP_REG_P (REGNO (operands[1])));
+    if (REGNO (operands[0]) == (REGNO (operands[1]) + 4))
+      return "not\\t%N0, %N1\;not\\t%F0, %F1";
+    else
+      return "not\\t%F0, %F1\;not\\t%N0, %N1";
+  }
+  [(set_attr "type" "alu")
+   (set_attr "length" "8")])
+
+;; Multiply instruction.  Idea for fixing registers comes from the AVR backend.
+
+(define_expand "mulsi3"
+  [(set (match_operand:SI 0 "register_operand" "")
+	(mult:SI (match_operand:SI 1 "register_operand" "")
+		 (match_operand:SI 2 "register_operand" "")))]
+  ""
+  {
+     emit_insn (gen_mulsi3_fixinp (operands[0], operands[1], operands[2]));
+     DONE;
+  })
+
+
+(define_expand "mulsi3_fixinp"
+  [(set (reg:SI 112) (match_operand:SI 1 "register_operand" ""))
+   (set (reg:SI 116) (match_operand:SI 2 "register_operand" ""))
+   (set (reg:SI 104) (mult:SI (reg:SI 112) (reg:SI 116)))
+   (set (match_operand:SI 0 "register_operand" "") (reg:SI 104))]
+  ""
+  {
+  })
+
+(define_insn "*mulsi3_prumac"
+  [(set (reg:SI 104) (mult:SI (reg:SI 112) (reg:SI 116)))]
+  ""
+  "nop\;xin\\t0, r26, 4"
+  [(set_attr "type" "alu")
+   (set_attr "length" "8")])
+
+;; Prologue, Epilogue and Return
+
+(define_expand "prologue"
+  [(const_int 1)]
+  ""
+{
+  pru_expand_prologue ();
+  DONE;
+})
+
+(define_expand "epilogue"
+  [(return)]
+  ""
+{
+  pru_expand_epilogue (false);
+  DONE;
+})
+
+(define_expand "sibcall_epilogue"
+  [(return)]
+  ""
+{
+  pru_expand_epilogue (true);
+  DONE;
+})
+
+(define_insn "return"
+  [(simple_return)]
+  "pru_can_use_return_insn ()"
+  "ret")
+
+(define_insn "simple_return"
+  [(simple_return)]
+  ""
+  "ret")
+
+;; Block any insns from being moved before this point, since the
+;; profiling call to mcount can use various registers that aren't
+;; saved or used to pass arguments.
+
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "type" "unknown")
+   (set_attr "length" "0")])
+
+;;  Jumps and calls
+
+(define_insn "indirect_jump"
+  [(set (pc) (match_operand:SI 0 "register_operand" "r"))]
+  ""
+  "jmp\\t%0"
+  [(set_attr "type" "control")])
+
+(define_insn "jump"
+  [(set (pc)
+	(label_ref (match_operand 0 "" "")))]
+  ""
+  "jmp\\t%%label(%l0)"
+  [(set_attr "type" "control")
+   (set_attr "length" "4")])
+
+
+(define_expand "call"
+  [(parallel [(call (match_operand 0 "" "")
+		    (match_operand 1 "" ""))
+	      (clobber (reg:HI RA_REGNO))])]
+  ""
+  "")
+
+(define_expand "call_value"
+  [(parallel [(set (match_operand 0 "" "")
+		   (call (match_operand 1 "" "")
+			 (match_operand 2 "" "")))
+	      (clobber (reg:HI RA_REGNO))])]
+  ""
+  "")
+
+(define_insn "*call"
+  [(call (mem:SI (match_operand:SI 0 "call_operand" "i,r"))
+	 (match_operand 1 "" ""))
+   (clobber (reg:HI RA_REGNO))]
+  ""
+  "@
+    call\\t%%label(%0)
+    call\\t%0"
+  [(set_attr "type" "control")])
+
+(define_insn "*call_value"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:SI 1 "call_operand" "i,r"))
+	      (match_operand 2 "" "")))
+   (clobber (reg:HI RA_REGNO))]
+  ""
+  "@
+    call\\t%%label(%1)
+    call\\t%1"
+  [(set_attr "type" "control")])
+
+(define_expand "sibcall"
+  [(parallel [(call (match_operand 0 "" "")
+		    (match_operand 1 "" ""))
+	      (return)])]
+  ""
+  "")
+
+(define_expand "sibcall_value"
+  [(parallel [(set (match_operand 0 "" "")
+		   (call (match_operand 1 "" "")
+			 (match_operand 2 "" "")))
+	      (return)])]
+  ""
+  "")
+
+(define_insn "*sibcall"
+ [(call (mem:SI (match_operand:SI 0 "call_operand" "i,j"))
+	(match_operand 1 "" ""))
+  (return)]
+  "SIBLING_CALL_P (insn)"
+  "@
+    jmp\\t%%label(%0)
+    jmp\\t%0"
+  [(set_attr "type" "control")])
+
+(define_insn "*sibcall_value"
+ [(set (match_operand 0 "register_operand" "")
+       (call (mem:SI (match_operand:SI 1 "call_operand" "i,j"))
+	     (match_operand 2 "" "")))
+  (return)]
+  "SIBLING_CALL_P (insn)"
+  "@
+    jmp\\t%%label(%1)
+    jmp\\t%1"
+  [(set_attr "type" "control")])
+
+(define_insn "*tablejump"
+  [(set (pc)
+	(match_operand:SI 0 "register_operand" "r"))
+   (use (label_ref (match_operand 1 "" "")))]
+  ""
+  "jmp\\t%0"
+  [(set_attr "type" "control")])
+
+;; cbranch pattern.
+;;
+;; NOTE: The short branch check has no typo! We must be conservative and take
+;; into account the worst case of having a signed comparison with a
+;; "far taken branch" label, which amounts to 7 instructions.
+
+(define_insn "cbranch<mode>4"
+  [(set (pc)
+     (if_then_else
+       (match_operator 0 "ordered_comparison_operator"
+	 [(match_operand:QISI 1 "register_operand" "r,r,r")
+	  (match_operand:QISI 2 "reg_or_ubyte_operand" "r,Z,I")])
+       (label_ref (match_operand 3 "" ""))
+       (pc)))]
+  ""
+{
+  const int length = (get_attr_length (insn));
+  const bool is_near = (length == 20 || length == 4);
+
+  if (pru_signed_cmp_operator (operands[0], VOIDmode))
+    {
+      enum rtx_code code = GET_CODE (operands[0]);
+
+      if (which_alternative == 0)
+	return pru_output_signed_cbranch (operands, is_near);
+      else if (which_alternative == 1 && (code == LT || code == GE))
+	return pru_output_signed_cbranch_zeroop2 (operands, is_near);
+      else
+	return pru_output_signed_cbranch_ubyteop2 (operands, is_near);
+    }
+  else
+    {
+      /* PRU demands OP1 to be immediate, so swap operands.  */
+      if (is_near)
+	return "qb%P0\t%l3, %1, %2";
+      else
+	return "qb%Q0\t.+8, %1, %2\;jmp\t%%label(%l3)";
+    }
+}
+  [(set_attr "type" "control")
+   (set (attr "length")
+	(if_then_else
+	    (and (ge (minus (match_dup 3) (pc)) (const_int -2020))
+		 (le (minus (match_dup 3) (pc)) (const_int 2016)))
+	    (if_then_else
+		(match_test "pru_signed_cmp_operator (operands[0], VOIDmode)")
+		    (const_int 20)
+		    (const_int 4))
+	    (if_then_else
+		(match_test "pru_signed_cmp_operator (operands[0], VOIDmode)")
+		    (const_int 28)
+		    (const_int 8))))])
+
+
+(define_expand "cbranch<mode>4"
+  [(set (pc)
+	(if_then_else (match_operator 0 "pru_fp_comparison_operator"
+		       [(match_operand:SFDF 1 "register_operand" "")
+			(match_operand:SFDF 2 "register_operand" "")])
+		      (label_ref (match_operand 3 "" ""))
+		      (pc)))]
+  ""
+{
+  rtx t = pru_expand_fp_compare (operands[0], VOIDmode);
+  operands[0] = t;
+  operands[1] = XEXP (t, 0);
+  operands[2] = XEXP (t, 1);
+})
+
+;
+; Bit test branch
+
+(define_code_iterator BIT_TEST  [eq ne])
+(define_code_attr qbbx_op   [(eq "qbbc") (ne "qbbs")])
+(define_code_attr qbbx_negop   [(eq "qbbs") (ne "qbbc")])
+
+(define_insn "cbranch_qbbx_<BIT_TEST:code><EQS0:mode><EQS1:mode><EQD:mode>4"
+ [(set (pc)
+   (if_then_else
+    (BIT_TEST (zero_extract:EQD
+	 (match_operand:EQS0 0 "register_operand" "r")
+	 (const_int 1)
+	 (match_operand:EQS1 1 "reg_or_ubyte_operand" "rI"))
+     (const_int 0))
+    (label_ref (match_operand 2 "" ""))
+    (pc)))]
+  ""
+{
+  const int length = (get_attr_length (insn));
+  const bool is_near = (length == 4);
+  if (is_near)
+    return "<BIT_TEST:qbbx_op>\\t%l2, %0, %1";
+  else
+    return "<BIT_TEST:qbbx_negop>\\t.+8, %0, %1\;jmp\\t%%label(%l2)";
+}
+  [(set_attr "type" "control")
+   (set (attr "length")
+      (if_then_else
+	  (and (ge (minus (match_dup 2) (pc)) (const_int -2048))
+	       (le (minus (match_dup 2) (pc)) (const_int 2044)))
+	  (const_int 4)
+	  (const_int 8)))])
+
+;; ::::::::::::::::::::
+;; ::
+;; :: Low Overhead Looping - idea "borrowed" from MEP
+;; ::
+;; ::::::::::::::::::::
+
+;; This insn is volatile because we'd like it to stay in its original
+;; position, just before the loop header.  If it stays there, we might
+;; be able to convert it into a "loop" insn.
+(define_insn "doloop_begin_internal<mode>"
+  [(set (match_operand:HISI 0 "register_operand" "=r")
+	(unspec_volatile:HISI
+	 [(match_operand:HISI 1 "reg_or_ubyte_operand" "rI")
+	  (match_operand 2 "const_int_operand" "")] UNSPECV_LOOP_BEGIN))]
+  ""
+  { gcc_unreachable (); }
+  [(set_attr "length" "4")])
+
+(define_expand "doloop_begin"
+  [(use (match_operand 0 "register_operand" ""))
+   (use (match_operand 1 "" ""))]
+  "TARGET_OPT_LOOP"
+  "pru_emit_doloop (operands, 0);
+   DONE;
+  ")
+
+; Note: "JUMP_INSNs and CALL_INSNs are not allowed to have any output
+; reloads;". Hence this insn must be prepared for a counter that is
+; not a register.
+(define_insn "doloop_end_internal<mode>"
+  [(set (pc)
+	(if_then_else (ne (match_operand:HISI 0 "nonimmediate_operand" "+r,*m")
+			  (const_int 1))
+		      (label_ref (match_operand 1 "" ""))
+		      (pc)))
+   (set (match_dup 0)
+	(plus:HISI (match_dup 0)
+		 (const_int -1)))
+   (unspec [(match_operand 2 "const_int_operand" "")] UNSPECV_LOOP_END)
+   (clobber (match_scratch:HISI 3 "=X,&r"))]
+  ""
+  { gcc_unreachable (); }
+  ;; Worst case length:
+  ;;
+  ;;      sub <op3>, 1		4
+  ;;      qbeq .+8, <op3>, 0    4
+  ;;      jmp <op1>		4
+  [(set (attr "length")
+      (if_then_else
+	  (and (ge (minus (pc) (match_dup 1)) (const_int 0))
+	       (le (minus (pc) (match_dup 1)) (const_int 1020)))
+	  (const_int 4)
+	  (const_int 12)))])
+
+(define_expand "doloop_end"
+  [(use (match_operand 0 "nonimmediate_operand" ""))
+   (use (label_ref (match_operand 1 "" "")))]
+  "TARGET_OPT_LOOP"
+  "if (GET_CODE (operands[0]) == REG && GET_MODE (operands[0]) == QImode)
+     FAIL;
+   pru_emit_doloop (operands, 1);
+   DONE;
+  ")
+
+(define_insn "pruloop<mode>"
+  [(set (reg:HISI LOOPCNTR_REG)
+	(unspec:HISI [(match_operand:HISI 0 "reg_or_ubyte_operand" "rI")
+		    (label_ref (match_operand 1 "" ""))]
+		   UNSPECV_LOOP_BEGIN))]
+  ""
+  "loop\\t%l1, %0"
+  [(set_attr "length" "4")])
+
+(define_insn "pruloop_end"
+  [(unspec [(const_int 0)] UNSPECV_LOOP_END)]
+  ""
+  "# loop end"
+  [(set_attr "length" "0")])
+
+
+;;  Misc patterns
+
+(define_insn "delay_cycles_start"
+  [(unspec_volatile [(match_operand 0 "immediate_operand" "i")]
+		    UNSPECV_DELAY_CYCLES_START)]
+  ""
+  "/* Begin %0 cycle delay.  */"
+)
+
+(define_insn "delay_cycles_end"
+  [(unspec_volatile [(match_operand 0 "immediate_operand" "i")]
+		    UNSPECV_DELAY_CYCLES_END)]
+  ""
+  "/* End %0 cycle delay.  */"
+)
+
+
+(define_insn "delay_cycles_2x_plus1_hi"
+  [(unspec_volatile [(match_operand 0 "const_uhword_operand" "J")]
+		    UNSPECV_DELAY_CYCLES_2X_HI)
+   (clobber (match_scratch:SI 1 "=&r"))]
+  ""
+  "ldi\\t%1, %0\;sub\\t%1, %1, 1\;qbne\\t.-4, %1, 0"
+  [(set_attr "length" "12")])
+
+
+; Do not use LDI32 here because we do not want
+; to accidentally loose one instruction cycle.
+(define_insn "delay_cycles_2x_plus2_si"
+  [(unspec_volatile [(match_operand:SI 0 "const_int_operand" "n")]
+		    UNSPECV_DELAY_CYCLES_2X_SI)
+   (clobber (match_scratch:SI 1 "=&r"))]
+  ""
+  "ldi\\t%1.w0, %L0\;ldi\\t%1.w2, %H0\;sub\\t%1, %1, 1\;qbne\\t.-4, %1, 0"
+  [(set_attr "length" "16")])
+
+(define_insn "delay_cycles_1"
+  [(unspec_volatile [(const_int 0) ] UNSPECV_DELAY_CYCLES_1)]
+  ""
+  "nop\\t# delay_cycles_1"
+)
+
+
+(define_insn "nop"
+  [(const_int 0)]
+  ""
+  "nop"
+  [(set_attr "type" "alu")])
+
+(define_insn "nop_loop_guard"
+  [(const_int 0)]
+  ""
+  "nop\\t# Loop end guard"
+  [(set_attr "type" "alu")])
diff --git a/gcc/config/pru/pru.opt b/gcc/config/pru/pru.opt
new file mode 100644
index 00000000000..e17d92fe1d3
--- /dev/null
+++ b/gcc/config/pru/pru.opt
@@ -0,0 +1,53 @@ 
+; Options for the TI PRU port of the compiler.
+; Copyright (C) 2018 Free Software Foundation, Inc.
+; Contributed by Dimitar Dimitrov <dimitar@dinux.eu>
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify
+; it under the terms of the GNU General Public License as published by
+; the Free Software Foundation; either version 3, or (at your option)
+; any later version.
+;
+; GCC is distributed in the hope that it will be useful,
+; but WITHOUT ANY WARRANTY; without even the implied warranty of
+; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+; GNU General Public License for more details.
+;
+; You should have received a copy of the GNU General Public License
+; along with GCC; see the file COPYING3.  If not see
+; <http://www.gnu.org/licenses/>.
+
+HeaderInclude
+config/pru/pru-opts.h
+
+minrt
+Target Report Mask(MINRT) RejectNegative
+Use a minimum runtime (no static initializers or ctors) for memory-constrained
+devices.
+
+mmcu=
+Target RejectNegative Joined
+-mmcu=MCU	Select the target System-On-Chip variant that embeds this PRU.
+
+mno-relax
+Target Report
+Prevent relaxation of LDI32 relocations.
+
+mloop
+Target Mask(OPT_LOOP)
+Allow (or do not allow) gcc to use the LOOP instruction.
+
+mabi=
+Target RejectNegative Report Joined Enum(pru_abi_t) Var(pru_current_abi) Init(PRU_ABI_GNU) Save
+Select target ABI variant.
+
+Enum
+Name(pru_abi_t) Type(enum pru_abi)
+ABI variant code generation (for use with -mabi= option):
+
+EnumValue
+Enum(pru_abi_t) String(gnu) Value(PRU_ABI_GNU)
+
+EnumValue
+Enum(pru_abi_t) String(ti) Value(PRU_ABI_TI)
diff --git a/gcc/config/pru/t-pru b/gcc/config/pru/t-pru
new file mode 100644
index 00000000000..ab29e443882
--- /dev/null
+++ b/gcc/config/pru/t-pru
@@ -0,0 +1,31 @@ 
+# Makefile fragment for building GCC for the TI PRU target.
+# Copyright (C) 2012-2018 Free Software Foundation, Inc.
+# Contributed by Dimitar Dimitrov <dimitar.dinux.eu>
+# Based on the t-nios2
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published
+# by the Free Software Foundation; either version 3, or (at your
+# option) any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+# the GNU General Public License for more details.
+#
+# You should have received a copy of the  GNU General Public
+# License along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Unfortunately mabi=ti is not feature-complete enough to build newlib.
+# Hence we cannot present mabi=gnu/ti as a multilib option.
+
+pru-pragma.o: $(srcdir)/config/pru/pru-pragma.c $(RTL_H) $(TREE_H) \
+		$(CONFIG_H) $(TM_H) $(srcdir)/config/pru/pru-protos.h
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+
+pru-passes.o: $(srcdir)/config/pru/pru-passes.c $(RTL_H) $(TREE_H) \
+		$(CONFIG_H) $(TM_H) $(srcdir)/config/pru/pru-protos.h
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e0a84b8b3c5..29ebb086f50 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -21535,6 +21535,7 @@  for further explanation.
 * ARM Pragmas::
 * M32C Pragmas::
 * MeP Pragmas::
+* PRU Pragmas::
 * RS/6000 and PowerPC Pragmas::
 * S/390 Pragmas::
 * Darwin Pragmas::
@@ -21686,6 +21687,25 @@  extern int foo ();
 
 @end table
 
+@node PRU Pragmas
+@subsection PRU Pragmas
+
+@table @code
+
+@item ctable_entry @var{index} @var{constant_address}
+@cindex pragma, ctable_entry
+Specifies that the given PRU CTABLE entry at @var{index} has a value
+@var{constant_address}. This enables GCC to emit LBCO/SBCO instructions
+when the load/store address is known and can be addressed with some CTABLE
+entry.  Example:
+
+@smallexample
+#pragma ctable_entry 2 0x4802a000
+*(unsigned int *)0x4802a010 = val; /* will compile to "lbco Rx, 2, 0x10, 4" */
+@end smallexample
+
+@end table
+
 @node RS/6000 and PowerPC Pragmas
 @subsection RS/6000 and PowerPC Pragmas
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 419980ff86e..af17e72bef1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1020,6 +1020,10 @@  See RS/6000 and PowerPC Options.
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg} @gol
 -mstack-protector-guard-offset=@var{offset}}
 
+@emph{PRU Options}
+@gccoptlist{-mmcu=@var{mcu}  -minrt -mno-relax -mloop @gol
+-mabi=@var{variant} @gol}
+
 @emph{RISC-V Options}
 @gccoptlist{-mbranch-cost=@var{N-instruction} @gol
 -mplt  -mno-plt @gol
@@ -14439,6 +14443,7 @@  platform.
 * picoChip Options::
 * PowerPC Options::
 * PowerPC SPE Options::
+* PRU Options::
 * RISC-V Options::
 * RL78 Options::
 * RS/6000 and PowerPC Options::
@@ -22914,6 +22919,56 @@  the offset with a symbol reference to a canary in the TLS block.
 @end table
 
 
+@node PRU Options
+@subsection PRU Options
+@cindex PRU Options
+
+These command-line options are defined for PRU target:
+
+@table @gcctabopt
+@item -minrt
+@opindex minrt
+Enable the use of a minimum runtime environment - no static
+initializers or constructors.  Results in significant code size
+reduction of final ELF binaries.
+
+@item -mmcu=@var{mcu}
+@opindex mmcu
+Specify the PRU MCU variant to use.  Check newlib for exact list of options.
+
+@item -mno-relax
+@opindex mno-relax
+Pass on (or do not pass on) the @option{-mrelax} command-line option
+to the linker.
+
+@item -mloop
+@opindex mloop
+Allow (or do not allow) gcc to use the LOOP instruction.
+
+@item -mabi=@var{variant}
+@opindex mabi
+Specify the ABI variant to output code for.  Permissible values are @samp{gnu}
+for GCC, and @samp{ti} for fully conformant TI ABI.  These are the differences:
+
+@table @samp
+@item Function Pointer Size
+TI ABI specifies that function (code) pointers are 16-bit, whereas GCC
+supports only 32-bit data and code pointers.
+
+@item Optional Return Value Pointer
+Function return values larger than 64-bits are passed by using a hidden
+pointer as the first argument of the function.  TI ABI, though, mandates that
+the pointer can be NULL in case the caller is not using the returned value.
+GCC always passes and expects a valid return value pointer.
+
+@end table
+
+The current @samp{mabi=ti} implementation will simply raise a compile error
+when any of the above code constructs is detected.
+
+@end table
+
+
 @node RISC-V Options
 @subsection RISC-V Options
 @cindex RISC-V Options
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index a3ecb711eca..c1fa37f97f4 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3298,6 +3298,28 @@  Vector constant that is all zeros.
 
 @end table
 
+@item PRU---@file{config/pru/constraints.md}
+@table @code
+@item I
+An unsigned 8-bit integer constant.
+
+@item J
+An unsigned 16-bit integer constant.
+
+@item L
+An unsigned 5-bit integer constant (for shift counts).
+
+@item M
+An integer constant in the range [-255;0].
+
+@item T
+A text segment (program memory) constant label.
+
+@item Z
+Integer constant zero.
+
+@end table
+
 @item RL78---@file{config/rl78/constraints.md}
 @table @code