[v8,04/12] LoongArch Port: Machine description files.

Message ID 20220304071809.3082015-5-xuchenghua@loongson.cn
State New
Series Add LoongArch support.

Commit Message

Chenghua Xu March 4, 2022, 7:18 a.m. UTC
From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/
	* config/loongarch/constraints.md: New file.
	* config/loongarch/generic.md: New file.
	* config/loongarch/la464.md: New file.
	* config/loongarch/loongarch-ftypes.def: New file.
	* config/loongarch/loongarch-modes.def: New file.
	* config/loongarch/loongarch.md: New file.
	* config/loongarch/predicates.md: New file.
	* config/loongarch/sync.md: New file.
---
 gcc/config/loongarch/constraints.md       |  204 ++
 gcc/config/loongarch/generic.md           |  132 +
 gcc/config/loongarch/la464.md             |  132 +
 gcc/config/loongarch/loongarch-ftypes.def |  106 +
 gcc/config/loongarch/loongarch-modes.def  |   29 +
 gcc/config/loongarch/loongarch.md         | 3712 +++++++++++++++++++++
 gcc/config/loongarch/predicates.md        |  527 +++
 gcc/config/loongarch/sync.md              |  574 ++++
 8 files changed, 5416 insertions(+)
 create mode 100644 gcc/config/loongarch/constraints.md
 create mode 100644 gcc/config/loongarch/generic.md
 create mode 100644 gcc/config/loongarch/la464.md
 create mode 100644 gcc/config/loongarch/loongarch-ftypes.def
 create mode 100644 gcc/config/loongarch/loongarch-modes.def
 create mode 100644 gcc/config/loongarch/loongarch.md
 create mode 100644 gcc/config/loongarch/predicates.md
 create mode 100644 gcc/config/loongarch/sync.md

Comments

Richard Sandiford March 6, 2022, 4:16 p.m. UTC | #1
Hi,

Some comments below, but otherwise it looks good to me.

xuchenghua@loongson.cn writes:
> […]
> +(define_memory_constraint "k"
> +  "A memory operand whose address is formed by a base register and (optionally scaled)
> +   index register."
> +  (and (match_code "mem")
> +       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
> +       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))

It's not really safe to test MEM addresses using only negative conditions.
There needs to be a positive condition too, even if it's only:

       (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
						 MEM_ADDR_SPACE (op))")))

(from common.md).

> […]
> +(define_constraint "v"
> +  "A nsigned 64-bit constant and low 44-bit is zero (for logic instructions)."

typo

> […]
> +(define_memory_constraint "ZB"
> +  "@internal
> +  An address that is held in a general-purpose register.
> +  The offset is zero"
> +  (and (match_code "mem")
> +       (match_test "GET_CODE (XEXP (op,0)) == REG")))

It'd be good to use things like REG_P in new code.

Formatting nit: should be a space after “op,”.

> […]
> +;; Pipeline descriptions.
> +;;
> +;; generic.md provides a fallback for processors without a specific
> +;; pipeline description.  It is derived from the old define_function_unit
> +;; version and uses the "alu" and "imuldiv" units declared below.
> +;;
> +;; Some of the processor-specific files are also derived from old
> +;; define_function_unit descriptions and simply override the parts of
> +;; generic.md that don't apply.  The other processor-specific files
> +;; are self-contained.

I don't think these last two paragraphs apply to the new code.
The MIPS generic.md was converted from a much older pipeline
description format.  The conversion meant the older processor
descriptions weren't self-contained and relied on a mixture
of processor-specific things (in their own file) and generic
things (in this file).

New processor descriptions should be self-contained as far
as possible, in terms of not sharing cpu units with other
processor descriptions.

> +(define_automaton "alu,imuldiv")
> +
> +(define_cpu_unit "alu" "alu")
> +(define_cpu_unit "imuldiv" "imuldiv")
> +
> +;; Ghost instructions produce no real code.
> +;; They exist purely to express an effect on dataflow.
> +(define_insn_reservation "ghost" 0
> +  (eq_attr "type" "ghost")
> +  "nothing")
> +
> +;; This file is derived from the old define_function_unit description.
> +;; Each reservation can be overridden on a processor-by-processor basis.

Same for this last comment.

The "ghost" reservation is inherently sharable because it doesn't
use any CPU units.  But for a new port, I think the other reservations
in this file should be conditional on a particular -mtune.

> […]
> diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
> new file mode 100644
> index 00000000000..ae3808b51bb
> --- /dev/null
> +++ b/gcc/config/loongarch/la464.md
> @@ -0,0 +1,132 @@
> +;; Pipeline model for LoongArch LA464 cores.
> +
> +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +;; Contributed by Loongson Ltd.
> +
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published
> +;; by the Free Software Foundation; either version 3, or (at your
> +;; option) any later version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +;; License for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; Uncomment the following line to output automata for debugging.
> +;; (automata_option "v")
> +
> +;; Automaton for integer instructions.
> +(define_automaton "la464_a_alu")
> +
> +;; Automaton for floating-point instructions.
> +(define_automaton "la464_a_falu")
> +
> +;; Automaton for memory operations.
> +(define_automaton "la464_a_mem")
> +
> +;; Describe the resources.
> +
> +(define_cpu_unit "la464_alu1" "la464_a_alu")
> +(define_cpu_unit "la464_alu2" "la464_a_alu")
> +(define_cpu_unit "la464_mem1" "la464_a_mem")
> +(define_cpu_unit "la464_mem2" "la464_a_mem")
> +(define_cpu_unit "la464_falu1" "la464_a_falu")
> +(define_cpu_unit "la464_falu2" "la464_a_falu")
> +
> +;; Describe instruction reservations.
> +
> +(define_insn_reservation "la464_arith" 1
> +  (and (match_test "TARGET_ARCH_LA464")

Normally scheduling should be determined by -mtune (with a default
-mtune chosen by -march when no explicit -mtune is given).  So I was
surprised to see this testing TARGET_ARCH_* instead of TARGET_TUNE_*.

> +       (eq_attr "type" "arith,clz,const,logical,
> +			move,nop,shift,signext,slt"))
> +  "la464_alu1 | la464_alu2")
> […]
> +;; Main data type used by the insn
> +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
> +  (const_string "unknown"))

Do any patterns use mode==OI?  The reason for asking is that:

> […]
> +;; True if the main data type is four times of the size of a word.
> +(define_attr "qword_mode" "no,yes"
> +  (cond [(and (eq_attr "mode" "TI,TF")
> +	      (not (match_test "TARGET_64BIT")))
> +	 (const_string "yes")]
> +	(const_string "no")))
> +
> +;; True if the main data type is eight times of the size of a word.
> +(define_attr "oword_mode" "no,yes"
> +  (cond [(and (eq_attr "mode" "OI")
> +	      (not (match_test "TARGET_64BIT")))
> +	 (const_string "yes")]
> +	(const_string "no")))

…it seemed inconsistent to treat OI as 8 words on 32-bit targets
but not as 4 words (qword_mode) on 64-bit targets.

It looks like oword_mode and OI might not be used, in which case
it's probably easiest to remove them for now.
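
To spell out the arithmetic behind that, here is a small sketch.  It
assumes TI is 16 bytes, OI is 32 bytes, and a word is 4 bytes without
TARGET_64BIT and 8 bytes with it; words_in_mode is a made-up name.

```c
/* Number of words needed to hold a mode of MODE_BYTES bytes when a word
   is WORD_BYTES bytes wide.  Hypothetical helper, for illustration only.  */
unsigned
words_in_mode (unsigned mode_bytes, unsigned word_bytes)
{
  return (mode_bytes + word_bytes - 1) / word_bytes;
}
```

words_in_mode (32, 4) is 8 and words_in_mode (32, 8) is 4, so OI would be
a four-word mode on 64-bit targets, but qword_mode only lists TI and TF
and only for !TARGET_64BIT.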

> +(define_attr "compression" "none,all"
> +  (const_string "none"))

Does anything use this, or is it a holdover from MIPS?  I could see some
definitions of the attribute later in the file, but there didn't seem
to be any users.

> […]
> +;; This mode iterator allows the QI HI SI and DI extension patterns to be

incomplete sentence.

> +(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
> +
> +;; Iterator for hardware-supported floating-point modes.
> +(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
> +			    (DF "TARGET_DOUBLE_FLOAT")])
> +
> +;; A floating-point mode for which moves involving FPRs may need to be split.

Probably s/floating-point //, given that the iterator includes DI.

> +(define_mode_iterator SPLITF
> +  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
> +   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
> +   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
> +
> +;; In GPR templates, a string like "mul.<d>" will expand to "mul" in the
> +;; 32-bit "mul.w" and "mul.d" in the 64-bit version.

Maybe:

;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
;; 32-bit version and "mul.d" in the 64-bit version.

> +(define_mode_attr d [(SI "w") (DI "d")])
> +
> +;; This attribute gives the length suffix for a load or store instruction.
> +;; The same suffixes work for zero and sign extensions.
> +(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
> +(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
> +
> +;; This attributes gives the mode mask of a SHORT.

s/attributes/attribute/

> +(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
> +
> +;; This attributes gives the size (bits) of a SHORT.

Maybe:

;; This attribute gives the number of bits in a SHORT minus one.

since it's 7,15 rather than 8,16.

> +(define_mode_attr qi_hi [(QI "7") (HI "15")])

Just a suggestion, but it might be worth renaming this.  qi_hi only
describes the range of inputs rather than what the attribute is.
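
To make the “minus one” point concrete: the values are used as the msb
operand of bstrpick.w in the zero_extend/truncate pattern later in the
patch.  The sketch below assumes that bstrpick extracts bits msb..lsb and
zero-extends them; bstrpick_model is a made-up name.

```c
#include <stdint.h>

/* Hypothetical model of a bit-field pick: return bits MSB..LSB of X,
   zero-extended.  Assumes 0 <= LSB <= MSB <= 63.  */
uint64_t
bstrpick_model (uint64_t x, unsigned msb, unsigned lsb)
{
  unsigned width = msb - lsb + 1;
  uint64_t mask = width == 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << width) - 1);

  return (x >> lsb) & mask;
}
```

So zero-extending a QI needs msb == 7 and zero-extending an HI needs
msb == 15, i.e. the number of bits minus one.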

> […]
> +(define_insn "*addsi3_extended"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
> +	(sign_extend:DI
> +	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
> +		      (match_operand:SI 2 "arith_operand" "r,I"))))]
> +  "TARGET_64BIT"
> +  "add%i2.w\t%0,%1,%2"
> +  [(set_attr "alu_type" "add")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*addsi3_extended2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
> +	(sign_extend:DI
> +	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
> +			      (match_operand:DI 2 "arith_operand"    "r,I"))
> +		     0)))]
> +  "TARGET_64BIT"
> +  "add%i2.w\t%0,%1,%2"
> +  [(set_attr "alu_type" "add")
> +   (set_attr "mode" "SI")])

This is really a question for part 5, but it affects this part too:

I notice the port defines TRULY_NOOP_TRUNCATION to false for DI->SI,
like MIPS does.  Are you sure that's the right trade-off?  It had to
be defined that way for MIPS because 32-bit MIPS instructions gave
undefined results if the input operands weren't in sign-extended form.
For example a 32-bit add with 64-bit register contents:

   0x(ffffffff_)fffffffe
 + 0x(00000000_)00000001

was guaranteed to give -1 (sign-extended) but:

   0x(00000000_)fffffffe
 + 0x(00000000_)00000001

gave an undefined result.

Does Loongson have the same restriction, or do the 32-bit instructions
simply ignore the upper 32 bits?  If Loongson ignores the upper bits
then it would probably be better not to define TRULY_NOOP_TRUNCATION.

If you do that, the subreg pattern above shouldn't be needed; the
subreg should get folded away by target-independent code.
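
As a rough illustration of the second case (a sketch only, not taken from
the ISA manual): if the 32-bit add ignores bits 63:32 of its inputs and
sign-extends its 32-bit result, both of the register contents above give
the same answer.  add_w_model is a made-up name.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit add on a 64-bit register file, assuming
   the instruction ignores the upper 32 bits of both inputs and writes
   back the sign-extended 32-bit sum.  */
int64_t
add_w_model (uint64_t a, uint64_t b)
{
  /* Only the low 32 bits of each input take part in the addition.  */
  uint32_t sum = (uint32_t) a + (uint32_t) b;

  /* Sign-extend bit 31 of the sum into bits 63:32.  */
  return (int64_t) sum - ((sum & 0x80000000u) ? ((int64_t) 1 << 32) : 0);
}
```

Under that assumption, 0xfffffffffffffffe + 1 and 0x00000000fffffffe + 1
both produce -1, which is the situation in which a DI->SI truncation
doesn't need an instruction.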

> +;;
> +;;  ....................
> +;;
> +;;	MULTIPLICATION
> +;;
> +;;  ....................
> +;;
> +
> +(define_insn "mul<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		   (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"

Not a big deal, just noticed that some ANYF patterns (like this one)
add an explicit TARGET_HARD_FLOAT on top of the ANYF conditions:

(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
			    (DF "TARGET_DOUBLE_FLOAT")])

while other patterns don't.  Both are correct of course, but it might
be good to use one style throughout the file.  At first, when I saw
the pattern above, I thought the TARGET_HARD_FLOAT was missing from
the earlier patterns.

> […]
> +(define_insn "*div<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		  (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fdiv.<fmt>\t%0,%1,%2"
> +  [(set_attr "type" "fdiv")
> +   (set_attr "mode" "<UNITMODE>")
> +   (set_attr "insn_count" "1")])
> +
> +;; In 3A5000, the reciprocal operation is the same as the division operation.
> +
> +(define_insn "*recip<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
> +		  (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "frecip.<fmt>\t%0,%2"
> +  [(set_attr "type" "frdiv")
> +   (set_attr "mode" "<UNITMODE>")
> +   (set_attr "insn_count" "1")])

Very minor, but these insn_counts seem redundant.  (MIPS needed them
due to the SB1 errata workaround.)

> […]
> +;; Floating point multiply accumulate instructions.
> +
> +;; a * b + c
> +(define_expand "fma<mode>4"
> +  [(set (match_operand:ANYF 0 "register_operand")
> +	(fma:ANYF (match_operand:ANYF 1 "register_operand")
> +		  (match_operand:ANYF 2 "register_operand")
> +		  (match_operand:ANYF 3 "register_operand")))]
> +  "TARGET_HARD_FLOAT")
> +
> +(define_insn "*fma<mode>4_madd4"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		  (match_operand:ANYF 2 "register_operand" "f")
> +		  (match_operand:ANYF 3 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fmadd.<fmt>\t%0,%1,%2,%3"
> +  [(set_attr "type" "fmadd")
> +   (set_attr "mode" "<UNITMODE>")])

This pair of patterns could be a single define_insn, like for fms.

> […]
> +;; Integer truncation patterns.  Truncating SImode values to smaller
> +;; modes is a no-op, as it is for most other GCC ports.  Truncating
> +;; DImode values to SImode is not a no-op for TARGET_64BIT since we
> +;; need to make sure that the lower 32 bits are properly sign-extended
> +;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
> +;; smaller than SImode is equivalent to two separate truncations:
> +;;
> +;;			  A       B
> +;;    DI ---> HI  ==  DI ---> SI ---> HI
> +;;    DI ---> QI  ==  DI ---> SI ---> QI
> +;;
> +;; Step A needs a real instruction but step B does not.
> +
> +(define_insn "truncdi<mode>2"
> +  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
> +	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
> +  "TARGET_64BIT"
> +  "@
> +    slli.w\t%0,%1,0
> +    st.<size>\t%1,%0
> +    stx.<size>\t%1,%0"
> +  [(set_attr "move_type" "sll0,store,store")
> +   (set_attr "mode" "SI")])

Following on from the above, this pattern wouldn't be needed
if TRULY_NOOP_TRUNCATION can be left undefined.

> […]
> +;; Combiner patterns to optimize shift/truncate combinations.
> +
> +(define_insn "*ashr_trunc<mode>"
> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
> +	(truncate:SUBDI
> +	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
> +		       (match_operand:DI 2 "const_arith_operand" ""))))]
> +  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
> +  "srai.d\t%0,%1,%2"
> +  [(set_attr "type" "shift")
> +   (set_attr "mode" "<MODE>")])
> +
> +(define_insn "*lshr32_trunc<mode>"
> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
> +	(truncate:SUBDI
> +	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
> +		       (const_int 32))))]
> +  "TARGET_64BIT"
> +  "srai.d\t%0,%1,32"
> +  [(set_attr "type" "shift")
> +   (set_attr "mode" "<MODE>")])

Same for these.

> +
> +;;
> +;;  ....................
> +;;
> +;;	ZERO EXTENSION
> +;;
> +;;  ....................
> +
> +(define_insn "zero_extendsidi2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
> +	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
> +  "TARGET_64BIT"
> +  "@
> +   bstrpick.d\t%0,%1,31,0
> +   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0

FWIW, splitting this after reload would give more scheduling freedom,
but it's not necessary to do that for the submission.

> +   ld.wu\t%0,%1
> +   ldx.wu\t%0,%1"
> +  [(set_attr "move_type" "arith,load,load,load")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "1,2,1,1")])
> +
> […]
> +;; Combiner patterns to optimize truncate/zero_extend combinations.
> +
> +(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +	(zero_extend:GPR
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
> +  [(set_attr "move_type" "pick_ins")
> +   (set_attr "mode" "<GPR:MODE>")])
> +
> +(define_insn "*zero_extendhi_truncqi"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +	(zero_extend:HI
> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "andi\t%0,%1,0xff"
> +  [(set_attr "alu_type" "and")
> +   (set_attr "mode" "HI")])
> +
> +;;
> +;;  ....................
> +;;
> +;;	SIGN EXTENSION
> +;;
> +;;  ....................
> +
> +;; Extension insns.
> +;; Those for integer source operand are ordered widest source type first.
> +
> +;; When TARGET_64BIT, all SImode integer should already be in sign-extended
> +;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
> +;; get rid of register->register instructions if we constrain the source to
> +;; be in the same register as the destination.
> +;;
> +;; Only the pre-reload scheduler sees the type of the register alternatives;
> +;; we split them into nothing before the post-reload scheduler runs.
> +;; These alternatives therefore have type "move" in order to reflect
> +;; what happens if the two pre-reload operands cannot be tied, and are
> +;; instead allocated two separate GPRs.
> +(define_insn_and_split "extendsidi2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
> +	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
> +  "TARGET_64BIT"
> +  "@
> +   #
> +   ldptr.w\t%0,%1
> +   ld.w\t%0,%1
> +   ldx.w\t%0,%1"
> +  "&& reload_completed && register_operand (operands[1], VOIDmode)"
> +  [(const_int 0)]
> +{
> +  emit_note (NOTE_INSN_DELETED);
> +  DONE;
> +}
> +  [(set_attr "move_type" "move,load,load,load")
> +   (set_attr "mode" "DI")])

The three patterns above would also look different without
the definition of TRULY_NOOP_TRUNCATION.

> […]
> +(define_insn "*extenddi_truncate<mode>"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(sign_extend:DI
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.<size>\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "DI")])
> +
> +(define_insn "*extendsi_truncate<mode>"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +	(sign_extend:SI
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.<size>\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*extendhi_truncateqi"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +	(sign_extend:HI
> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.b\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "SI")])

Same for these.

> +
> +(define_insn "extendsfdf2"
> +  [(set (match_operand:DF 0 "register_operand" "=f")
> +	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
> +  "TARGET_DOUBLE_FLOAT"
> +  "fcvt.d.s\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "cnv_mode"	"S2D")
> +   (set_attr "mode" "DF")])

Very minor, but it's probably better to use a space rather than a tab
after "cnv_mode", for consistency with the other attributes.  Same in
the rest of the file.

> +;; floating point value by converting to value to an unsigned integer

typo, maybe:

;; Convert a floating-point value to an unsigned integer.

> +
> +(define_expand "fixuns_truncdfsi2"
> +  [(set (match_operand:SI 0 "register_operand")
> +	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
> +  "TARGET_DOUBLE_FLOAT"
> +{
> +  rtx reg1 = gen_reg_rtx (DFmode);
> +  rtx reg2 = gen_reg_rtx (DFmode);
> +  rtx reg3 = gen_reg_rtx (SImode);
> +  rtx_code_label *label1 = gen_label_rtx ();
> +  rtx_code_label *label2 = gen_label_rtx ();
> +  rtx test;
> +  REAL_VALUE_TYPE offset;
> +
> +  real_2expN (&offset, 31, DFmode);
> +
> +  if (reg1)		      /* Turn off complaints about unreached code.  */
> +    {

I think we can drop this workaround.  IIRC it was needed for the
MIPS IRIX compilers, or maybe it was some ancient version of GCC.

> +      loongarch_emit_move (reg1,
> +			   const_double_from_real_value (offset, DFmode));
> +      do_pending_stack_adjust ();
> +
> +      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
> +      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
> +
> +      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
> +      emit_jump_insn (gen_rtx_SET (pc_rtx,
> +				   gen_rtx_LABEL_REF (VOIDmode, label2)));
> +      emit_barrier ();
> +
> +      emit_label (label1);
> +      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
> +      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
> +				     (BITMASK_HIGH, SImode)));
> +
> +      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
> +      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
> +
> +      emit_label (label2);
> +
> +      /* Allow REG_NOTES to be set on last insn (labels don't have enough
> +	 fields, and can't be used for REG_NOTES anyway).  */
> +      emit_use (stack_pointer_rtx);
> +      DONE;
> +    }
> +})
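
For reference, the sequence emitted by the expander above corresponds
roughly to the C sketch below.  It is only a model: it assumes that
BITMASK_HIGH is the 32-bit sign-bit mask 0x80000000 (its definition isn't
quoted here) and that the input lies in the range [0, 2^32).
fixuns_model is a made-up name.

```c
#include <stdint.h>

/* Hypothetical model of the double -> unsigned int expansion, built only
   from a signed conversion, a compare and a subtract.  */
uint32_t
fixuns_model (double x)
{
  const double two31 = 2147483648.0;	/* 2^31, as built by real_2expN.  */

  if (x >= two31)
    /* label1 path: convert x - 2^31 as a signed value, then set bit 31.  */
    return (uint32_t) (int32_t) (x - two31) | 0x80000000u;

  /* Fall-through path: a plain signed conversion is enough.  */
  return (uint32_t) (int32_t) x;
}
```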
> […]
> +;; Allow combine to split complex const_int load sequences, using operand 2
> +;; to store the intermediate results.  See move_operand for details.
> +(define_split
> +  [(set (match_operand:GPR 0 "register_operand")
> +	(match_operand:GPR 1 "splittable_const_int_operand"))
> +   (clobber (match_operand:GPR 2 "register_operand"))]
> +  ""
> +  [(const_int 0)]
> +{
> +  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
> +  DONE;
> +})

Does this define_split trigger?  I couldn't see an associated
define_insn that would provide the clobber.

> +(define_insn "*movsi_internal"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
> +	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
> +  "(register_operand (operands[0], SImode)
> +       || reg_or_0_operand (operands[1], SImode))"

Minor formatting nit: too much indentation on the line above.  Same for
the other moves.

> […]
> +;; Conditional move instructions.
> +
> +(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=r,r")
> +	(if_then_else:GPR
> +	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
> +			   (const_int 0))
> +	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
> +	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
> +  "register_operand (operands[2], <GPR:MODE>mode)
> +       != register_operand (operands[3], <GPR:MODE>mode)"

Same here.

> +  "@
> +   <sel>\t%0,%2,%1
> +   <selinv>\t%0,%3,%1"
> +  [(set_attr "type" "condmove")
> +   (set_attr "mode" "<GPR:MODE>")])
> +
> +;; sel.fmt copies the 3rd argument when the 1st is non-zero and the 2nd

s/sel.fmt/fsel/

> +;; argument if the 1st is zero.  This means operand 2 and 3 are
> +;; inverted in the instruction.
> +
> +(define_insn "*sel<mode>"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(if_then_else:ANYF
> +	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
> +		 (const_int 0))
> +	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
> +	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fsel\t%0,%3,%2,%1"
> +  [(set_attr "type" "condmove")
> +   (set_attr "mode" "<ANYF:MODE>")])
> […]
> +(define_insn "lu52i_d"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(ior:DI
> +	  (and:DI (match_operand:DI 1 "register_operand" "r")
> +		  (match_operand 2 "lu52i_mask_operand"))
> +	  (match_operand 3 "const_lu52i_operand" "v")))]
> +    "TARGET_64BIT"
> +    "lu52i.d\t%0,%1,%X3>>52"
> +    [(set_attr "type" "arith")
> +     (set_attr "mode" "DI")])

Formatting nit: too much indentation from "TARGET_64BIT" onwards.

> +
> +;; Convert floating-point numbers to integers
> +(define_insn "frint_<fmt>"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
> +		      UNSPEC_FRINT))]
> +  "TARGET_HARD_FLOAT"
> +  "frint.<fmt>\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<MODE>")])
> +
> +;; LoongArch supports loading and storing a floating point register from
> +;; the sum of two general-purpose registers.  We use two versions for each of
> +;; these four instructions: one where the two general-purpose registers are
> +;; SImode, and one where they are DImode.  This is because general-purpose
> +;; registers will be in SImode when they hold 32-bit values, but,
> +;; since the 32-bit values are always sign extended, the f{ld/st}x.{s/d}
> +;; instructions will still work correctly.
> +
> +;; ??? Perhaps it would be better to support these instructions by
> +;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
> +;; these instructions can only be used to load and store floating
> +;; point registers, that would probably cause trouble in reload.

Does this comment apply to Loongson?  It looks from:

  +;; Similarly for LoongArch indexed GPR loads and stores.
  +(define_mode_attr loadx [(QI "ldx.b")
  +			 (HI "ldx.h")
  +			 (SI "ldx.w")
  +			 (DI "ldx.d")])
  +(define_mode_attr storex [(QI "stx.b")
  +			  (HI "stx.h")
  +			  (SI "stx.w")
  +			  (DI "stx.d")])

like Loongson has a full set of indexed loads and stores, so supporting
indexed addresses in TARGET_LEGITIMATE_ADDRESS_P should work.

Also, the MIPS comment predates LRA, which is better than old reload
at handling irregular memory address requirements.  Accepting indexed
addresses in TARGET_LEGITIMATE_ADDRESS_P would allow ivopts to optimise
the code better.

> […]
> +;; Expand in-line code to clear the instruction cache between operand[0] and
> +;; operand[1].
> +(define_expand "clear_cache"
> +  [(match_operand 0 "pmode_register_operand")
> +   (match_operand 1 "pmode_register_operand")]
> +  ""
> +  "
> +{
> +  emit_insn (gen_ibar (const0_rtx));
> +  DONE;
> +}")

Minor nit: the quotes before { and after } are unnecessary.

> […]
> +(define_insn "asrtle_d"
> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
> +			      (match_operand:DI 1 "register_operand" "r")]
> +			      UNSPECV_ASRTLE_D)]
> +  "TARGET_64BIT"
> +  "asrtle.d\t%0,%1"
> +  [(set_attr "type" "load")
> +   (set_attr "mode" "DI")])
> +
> +(define_insn "asrtgt_d"
> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
> +			      (match_operand:DI 1 "register_operand" "r")]
> +			      UNSPECV_ASRTGT_D)]
> +  "TARGET_64BIT"
> +  "asrtgt.d\t%0,%1"
> +  [(set_attr "type" "load")
> +   (set_attr "mode" "DI")])

Formatting: the unspec_volatile pattern is indented too far in both cases.

> […]
> +;; The following templates were added to generate "bstrpick.d + alsl.d"
> +;; instruction pairs.
> +;; It is required that the values of const_immalsl_operand and
> +;; immediate_operand must have the following correspondence:
> +;;
> +;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
> +
> +(define_insn "zero_extend_ashift1"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +			   (match_operand 2 "const_immalsl_operand" ""))
> +		(match_operand 3 "immediate_operand" "")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

Without the TRULY_NOOP_TRUNCATION definition, this pattern would be
redundant with…

> +(define_insn "zero_extend_ashift2"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> +			   (match_operand 2 "const_immalsl_operand" ""))
> +		(match_operand 3 "immediate_operand" "")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

…this one.  I'm surprised it isn't already TBH.  Doesn't the subreg match:

  (match_operand:DI 1 "register_operand" "r")

?
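
FWIW, the condition on operands 2 and 3 in both patterns can be checked
with a small C model.  This is a sketch under the assumptions that alsl.d
with $r0 as the addend is a plain left shift and that the shift amount is
in the range 1 to 4; the function names are made up.

```c
#include <stdint.h>

/* The RTL form: (and:DI (ashift:DI x sh) imm).  */
uint64_t
and_ashift_model (uint64_t x, unsigned sh, uint64_t imm)
{
  return (x << sh) & imm;
}

/* The emitted pair: bstrpick.d picks bits 31..0 (zero-extended), then
   alsl.d with a zero addend shifts the result left by SH.  */
uint64_t
bstrpick_alsl_model (uint64_t x, unsigned sh)
{
  uint64_t low32 = x & 0xffffffffu;

  return low32 << sh;
}

/* The insn condition: (imm >> sh) == 0xffffffff.  */
int
mask_matches_shift (uint64_t imm, unsigned sh)
{
  return (imm >> sh) == 0xffffffffu;
}
```

Whenever mask_matches_shift holds, the two models agree for any x,
because the low sh bits of x << sh are already zero.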

> +
> +(define_insn "alsl_paired1"
> +  [(set (match_operand:DI 0 "register_operand" "=&r")
> +	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +				    (match_operand 2 "const_immalsl_operand" ""))
> +			 (match_operand 3 "immediate_operand" ""))
> +		 (match_operand:DI 4 "register_operand" "r")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
> +  [(set_attr "type" "arith")
> +  (set_attr "mode" "DI")
> +  (set_attr "insn_count" "2")])
> +
> +(define_insn "alsl_paired2"
> +  [(set (match_operand:DI 0 "register_operand" "=&r")
> +	(plus:DI (match_operand:DI 1 "register_operand" "r")
> +		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
> +				    (match_operand 3 "const_immalsl_operand" ""))
> +			 (match_operand 4 "immediate_operand" ""))))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

Same for this pair.

> […]
> +(define_expand "tablejump"
> +  [(set (pc)
> +	(match_operand 0 "register_operand"))
> +   (use (label_ref (match_operand 1 "")))]
> +  ""
> +{
> +  if (flag_pic)
> +      operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
> +					 gen_rtx_LABEL_REF (Pmode,
> +							    operands[1]),
> +					 NULL_RTX, 0, OPTAB_DIRECT);

Formatting nit: the last four lines should be indented by two spaces fewer.

> +  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
> +  DONE;
> +})
> +(define_insn "sibcall_internal"
> +  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
> +	 (match_operand 1 "" ""))]
> +  "SIBLING_CALL_P (insn)"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0:
> +      return "jr\t%0";
> +    case 1:
> +      if (TARGET_CMODEL_LARGE)
> +	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
> +	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
> +      else if (TARGET_CMODEL_EXTREME)
> +	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "b\t%0";
> +    case 2:
> +      if (TARGET_CMODEL_TINY_STATIC)
> +	return "b\t%0";
> +      else if (TARGET_CMODEL_EXTREME)
> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "la.global\t$r12,%0\n\tjr\t$r12";
> +    case 3:
> +      if (TARGET_CMODEL_EXTREME)
> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "la.global\t$r12,%0\n\tjr\t$r12";
> +    case 4:
> +      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
> +	return "b\t%%plt(%0)";
> +      else if (TARGET_CMODEL_LARGE)
> +	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
> +	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
> +      else
> +	{
> +	  sorry ("cmodel extreme and tiny static not support plt");

“do not support PLTs”.

Can this be triggered by a certain combination of source code
and command-line options, or is it really an internal error?

> +	  return "";  /* GCC complains about may fall through.  */

sorry() isn't a fatal error, so the compiler will continue.
Returning "" is the right thing to do, but the comment makes it
sound unreachable.

Same comments for later sorry() + return pairs.

> +	}
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
> +
> +(define_expand "sibcall_value"
> +  [(parallel [(set (match_operand 0 "")
> +		   (call (match_operand 1 "")
> +			 (match_operand 2 "")))
> +	      (use (match_operand 3 ""))])]		;; next_arg_reg
> +  ""
> +{
> +  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
> +
> + /*  Handle return values created by loongarch_pass_fpr_pair.  */

Formatting nit: should be two spaces before “/*” and only one after it.

> +  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
> +    {
> +      rtx arg1 = XEXP (XVECEXP (operands[0],0, 0), 0);
> +      rtx arg2 = XEXP (XVECEXP (operands[0],0, 1), 0);
> +
> +      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
> +							   operands[2],
> +							   arg2));
> +    }
> +   else
> +    {
> +      /*  Handle return values created by loongarch_return_fpr_single.  */

Should only be one space after “/*” here too.

> +      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
> +      operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);

The last line should be indented by two spaces more.

Same three comments for call_value.

> […]
> +;;
> +;;  ....................
> +;;
> +;;	MISC.
> +;;
> +;;  ....................
> +;;
> +
> +(define_insn "nop"
> +  [(const_int 0)]
> +  ""
> +  "nop"
> +  [(set_attr "type"	"nop")
> +   (set_attr "mode"	"none")])

Formatting nit: most of the rest of the file uses a space rather than a
tab after the attribute name.  IMO a space looks nicer.  Same for some
later patterns.

> +
> +;; __builtin_loongarch_movfcsr2gr: move the FCSR into operand 0.
> +(define_insn "loongarch_movfcsr2gr"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
> +    UNSPECV_MOVFCSR2GR))]

Last two lines look under-indented.

> +  "TARGET_HARD_FLOAT"
> +  "movfcsr2gr\t%0,$r%1")
> +
> […]
> +(define_insn "stack_tie<mode>"
> +  [(set (mem:BLK (scratch))
> +	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
> +		     (match_operand:GPR 1 "register_operand" "r")]
> +		    UNSPEC_TIE))]
> +  ""
> +  ""
> +  [(set_attr "length" "0")]
> +)

Using [(set_attr "type" "ghost")] should be slightly better for
scheduling than setting the length directly.

> +
> +(define_insn "gpr_restore_return"
> +  [(return)
> +   (use (match_operand 0 "pmode_register_operand" ""))
> +   (const_int 0)]
> +  ""
> +  "")

Might be worth adding a comment here.  Why does this form of return expand
to no code?

> +
> +(define_split
> +  [(match_operand 0 "small_data_pattern")]
> +  "reload_completed"
> +  [(match_dup 0)]
> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
> +
> +
> +;; Match paired HI/SI/SF/DFmode load/stores.
> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
> +  "=r,f,m,m,r,ZC,r,k,f,k")
> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
> +   "=r,f,m,m,r,ZC,r,k,f,k")
> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
> +  "reload_completed"
> +  {
> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
> +       Hardware does not bond those loads, even when they are consecutive.
> +       However, order of the loads need to be checked for correctness.  */
> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
> +      {

I'm not sure I understand how these patterns work, but it looks like the
condition above is trying to work around a later change to the insn by
regrename, after peephole2 has checked loongarch_load_store_bonding_p.
If so, you should be able to avoid that by marking the destinations of
the loads as earlyclobbers, using "&r" instead of "r" for the first
alternative.  regrename should then preserve the conditions that
loongarch_load_store_bonding_p checked earlier.

Same for the other patterns.
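
To illustrate the hazard that the ordering check guards against, here is a small C model of a register file and memory.  It is only a sketch: the struct, the function names and the register numbers are made up for the example and do not correspond to anything in the port.

```c
#include <stdint.h>

enum { NUM_REGS = 8, MEM_WORDS = 16 };

struct machine
{
  uint32_t reg[NUM_REGS];
  uint32_t mem[MEM_WORDS];
};

/* Load word number (reg[base] + index) into reg[dest].  */
static void
load_word (struct machine *m, int dest, int base, int index)
{
  m->reg[dest] = m->mem[m->reg[base] + index];
}

/* Emit the pair in program order.  This goes wrong if dest0 == base,
   because the first load overwrites the base register before the
   second load reads it.  */
static void
load_pair_in_order (struct machine *m, int dest0, int dest1, int base)
{
  load_word (m, dest0, base, 0);
  load_word (m, dest1, base, 1);
}

/* Mirror of the quoted output code: swap the two loads when the first
   destination overlaps the address register.  */
static void
load_pair_checked (struct machine *m, int dest0, int dest1, int base)
{
  if (dest0 != base)
    load_pair_in_order (m, dest0, dest1, base);
  else
    {
      load_word (m, dest1, base, 1);
      load_word (m, dest0, base, 0);
    }
}
```

With an earlyclobber on the load destination, the dest0 == base case should never be presented to the output code in the first place, so the in-order form would be enough.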

> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
> +			 operands);
> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
> +			 &operands[2]);
> +      }
> +    else
> +      {
> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
> +			 &operands[2]);
> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
> +			 operands);
> +      }
> +    return "";
> +  }
> +  [(set_attr "move_type"
> +  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
> +   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
> […]
> +;; This is used for indexing into vectors, and hence only accepts const_int.
> +(define_predicate "const_0_or_1_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))

It doesn't look like this is used.  It'd be good to check the other
predicates to see if any of them can be removed.  In particular…

> […]
> +(define_predicate "const_8_to_15_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
> +
> +(define_predicate "const_16_to_31_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))

…these two don't seem to do what their name suggests.
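
A quick C sketch of the mismatch, using a local stand-in for IN_RANGE.  The "suggested by the name" ranges are an assumption about the intent, not something stated in the patch:

```c
#include <stdbool.h>

/* Local stand-in for GCC's IN_RANGE: true if LOWER <= VALUE <= UPPER.  */
static bool
in_range (long value, long lower, long upper)
{
  return lower <= value && value <= upper;
}

/* The test as quoted for both predicates.  */
static bool
quoted_test (long value)
{
  return in_range (value, 0, 7);
}

/* What the name const_8_to_15_operand suggests (assumed intent).  */
static bool
const_8_to_15 (long value)
{
  return in_range (value, 8, 15);
}

/* Likewise for const_16_to_31_operand (assumed intent).  */
static bool
const_16_to_31 (long value)
{
  return in_range (value, 16, 31);
}
```

As quoted, both predicates accept 0 and reject 8 and 16, which is the opposite of what their names imply.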

> […]
> +(define_predicate "muldiv_target_operand"
> +		(match_operand 0 "register_operand"))

This one also seems unused.  IMO using register_operand would be clearer.

> […]
> +(define_predicate "is_const_call_local_symbol"
> +  (and (match_operand 0 "const_call_insn_operand")
> +       (ior (match_test "loongarch_global_symbol_p (op) == 0")
> +       (match_test "loongarch_symbol_binds_local_p (op) != 0"))

The indentation looks misleading here: the last line is another
operand of the ior.

Thanks,
Richard

> +       (match_test "CONSTANT_P (op)")))
> […]
Lulu Cheng March 7, 2022, 3:59 a.m. UTC | #2
Hi Richard,

   Thanks for your review.
   We will revise it as soon as possible and submit it in the next version.

On 2022/3/7 at 12:16 AM, Richard Sandiford wrote:

> Hi,
>
> Some comments below, but otherwise it looks good to me.
>
> xuchenghua@loongson.cn writes:
>> […]
>> +(define_memory_constraint "k"
>> +  "A memory operand whose address is formed by a base register and (optionally scaled)
>> +   index register."
>> +  (and (match_code "mem")
>> +       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
>> +       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))
> It's not really safe to test MEM addresses using only negative conditions.
> There needs to be a positive condition too, even if it's only:
>
>         (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
> 						 MEM_ADDR_SPACE (op))")))
>
> (from common.md).
>
>> […]
>> +(define_constraint "v"
>> +  "A nsigned 64-bit constant and low 44-bit is zero (for logic instructions)."
> typo
>
>> […]
>> +(define_memory_constraint "ZB"
>> +  "@internal
>> +  An address that is held in a general-purpose register.
>> +  The offset is zero"
>> +  (and (match_code "mem")
>> +       (match_test "GET_CODE (XEXP (op,0)) == REG")))
> It'd be good to use things like REG_P in new code.
>
> Formatting nit: should be a space after “op,”.
>
>> […]
>> +;; Pipeline descriptions.
>> +;;
>> +;; generic.md provides a fallback for processors without a specific
>> +;; pipeline description.  It is derived from the old define_function_unit
>> +;; version and uses the "alu" and "imuldiv" units declared below.
>> +;;
>> +;; Some of the processor-specific files are also derived from old
>> +;; define_function_unit descriptions and simply override the parts of
>> +;; generic.md that don't apply.  The other processor-specific files
>> +;; are self-contained.
> I don't think these last two paragraphs apply to the new code.
> The MIPS generic.md was converted from a much older pipeline
> description format.  The conversion meant the older processor
> descriptions weren't self-contained and relied on a mixture
> of processor-specific things (in their own file) and generic
> things (in this file).
>
> New processor descriptions should be self-contained as far
> as possible, in terms of not sharing cpu units with other
> processor descriptions.
>
>> +(define_automaton "alu,imuldiv")
>> +
>> +(define_cpu_unit "alu" "alu")
>> +(define_cpu_unit "imuldiv" "imuldiv")
>> +
>> +;; Ghost instructions produce no real code.
>> +;; They exist purely to express an effect on dataflow.
>> +(define_insn_reservation "ghost" 0
>> +  (eq_attr "type" "ghost")
>> +  "nothing")
>> +
>> +;; This file is derived from the old define_function_unit description.
>> +;; Each reservation can be overridden on a processor-by-processor basis.
> Same for this last comment.
>
> The "ghost" reservation is inherently sharable because it doesn't
> use any CPU units.  But for a new port, I think the other reservations
> in this file should be conditional on a particular -mtune.
>
>> […]
>> diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
>> new file mode 100644
>> index 00000000000..ae3808b51bb
>> --- /dev/null
>> +++ b/gcc/config/loongarch/la464.md
>> @@ -0,0 +1,132 @@
>> +;; Pipeline model for LoongArch LA464 cores.
>> +
>> +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
>> +;; Contributed by Loongson Ltd.
>> +
>> +;; This file is part of GCC.
>> +;;
>> +;; GCC is free software; you can redistribute it and/or modify it
>> +;; under the terms of the GNU General Public License as published
>> +;; by the Free Software Foundation; either version 3, or (at your
>> +;; option) any later version.
>> +;;
>> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
>> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +;; License for more details.
>> +;;
>> +;; You should have received a copy of the GNU General Public License
>> +;; along with GCC; see the file COPYING3.  If not see
>> +;; <http://www.gnu.org/licenses/>.
>> +
>> +;; Uncomment the following line to output automata for debugging.
>> +;; (automata_option "v")
>> +
>> +;; Automaton for integer instructions.
>> +(define_automaton "la464_a_alu")
>> +
>> +;; Automaton for floating-point instructions.
>> +(define_automaton "la464_a_falu")
>> +
>> +;; Automaton for memory operations.
>> +(define_automaton "la464_a_mem")
>> +
>> +;; Describe the resources.
>> +
>> +(define_cpu_unit "la464_alu1" "la464_a_alu")
>> +(define_cpu_unit "la464_alu2" "la464_a_alu")
>> +(define_cpu_unit "la464_mem1" "la464_a_mem")
>> +(define_cpu_unit "la464_mem2" "la464_a_mem")
>> +(define_cpu_unit "la464_falu1" "la464_a_falu")
>> +(define_cpu_unit "la464_falu2" "la464_a_falu")
>> +
>> +;; Describe instruction reservations.
>> +
>> +(define_insn_reservation "la464_arith" 1
>> +  (and (match_test "TARGET_ARCH_LA464")
> Normally scheduling should be determined by -mtune (with a default
> -mtune chosen by -march when no explicit -mtune is given).  So I was
> surprised to see this testing TARGET_ARCH_* instead of TARGET_TUNE_*.
>
>> +       (eq_attr "type" "arith,clz,const,logical,
>> +			move,nop,shift,signext,slt"))
>> +  "la464_alu1 | la464_alu2")
>> […]
>> +;; Main data type used by the insn
>> +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
>> +  (const_string "unknown"))
> Do any patterns use mode==OI?  Reason for asking is that:
>
>> […]
>> +;; True if the main data type is four times of the size of a word.
>> +(define_attr "qword_mode" "no,yes"
>> +  (cond [(and (eq_attr "mode" "TI,TF")
>> +	      (not (match_test "TARGET_64BIT")))
>> +	 (const_string "yes")]
>> +	(const_string "no")))
>> +
>> +;; True if the main data type is eight times of the size of a word.
>> +(define_attr "oword_mode" "no,yes"
>> +  (cond [(and (eq_attr "mode" "OI")
>> +	      (not (match_test "TARGET_64BIT")))
>> +	 (const_string "yes")]
>> +	(const_string "no")))
> …it seemed inconsistent to treat OI as 8 words on 32-bit targets
> but not as 4 words (qword_mode) on 64-bit targets.
>
> It looks like oword_mode and OI might not be used, in which case
> it's probably easiest to remove them for now.
>
>> +(define_attr "compression" "none,all"
>> +  (const_string "none"))
> Does anything use this, or is it a holdover from MIPS?  I could see some
> definitions of the attribute later in the file, but there didn't seem
> to be any users.
>
>> […]
>> +;; This mode iterator allows the QI HI SI and DI extension patterns to be
> incomplete sentence.
>
>> +(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
>> +
>> +;; Iterator for hardware-supported floating-point modes.
>> +(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
>> +			    (DF "TARGET_DOUBLE_FLOAT")])
>> +
>> +;; A floating-point mode for which moves involving FPRs may need to be split.
> Probably s/floating-point //, given that the iterator includes DI.
>
>> +(define_mode_iterator SPLITF
>> +  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
>> +   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
>> +   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
>> +
>> +;; In GPR templates, a string like "mul.<d>" will expand to "mul" in the
>> +;; 32-bit "mul.w" and "mul.d" in the 64-bit version.
> Maybe:
>
> ;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
> ;; 32-bit version and "mul.d" in the 64-bit version.
>
>> +(define_mode_attr d [(SI "w") (DI "d")])
>> +
>> +;; This attribute gives the length suffix for a load or store instruction.
>> +;; The same suffixes work for zero and sign extensions.
>> +(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
>> +(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
>> +
>> +;; This attributes gives the mode mask of a SHORT.
> s/attributes/attribute/
>
>> +(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
>> +
>> +;; This attributes gives the size (bits) of a SHORT.
> Maybe:
>
> ;; This attribute gives the number of bits in a SHORT minus one.
>
> since it's 7,15 rather than 8,16.
>
>> +(define_mode_attr qi_hi [(QI "7") (HI "15")])
> Just a suggestion, but it might be worth renaming this.  qi_hi only
> describes the range of inputs rather than what the attribute is.
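
To show why the attribute values are 7 and 15 rather than 8 and 16, here is a C sketch of a zero-extending bit-field pick that takes the most significant bit position, modelled on how the quoted templates use bstrpick.w.  The function is an illustrative model written for this note, not taken from the ISA manual:

```c
#include <stdint.h>

/* Zero-extending pick of bits [msb:lsb] of VALUE.
   Requires 0 <= lsb <= msb <= 31.  */
static uint32_t
pick_bits (uint32_t value, unsigned msb, unsigned lsb)
{
  unsigned width = msb - lsb + 1;
  /* Avoid shifting by 32, which is undefined for a 32-bit type.  */
  uint32_t mask = width == 32 ? 0xffffffffu : ((1u << width) - 1u);
  return (value >> lsb) & mask;
}
```

With this model, pick_bits (x, 7, 0) is the QImode zero-extension and pick_bits (x, 15, 0) the HImode one, so the operand is "number of bits minus one".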
>
>> […]
>> +(define_insn "*addsi3_extended"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
>> +	(sign_extend:DI
>> +	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
>> +		      (match_operand:SI 2 "arith_operand" "r,I"))))]
>> +  "TARGET_64BIT"
>> +  "add%i2.w\t%0,%1,%2"
>> +  [(set_attr "alu_type" "add")
>> +   (set_attr "mode" "SI")])
>> +
>> +(define_insn "*addsi3_extended2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
>> +	(sign_extend:DI
>> +	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
>> +			      (match_operand:DI 2 "arith_operand"    "r,I"))
>> +		     0)))]
>> +  "TARGET_64BIT"
>> +  "add%i2.w\t%0,%1,%2"
>> +  [(set_attr "alu_type" "add")
>> +   (set_attr "mode" "SI")])
> This is really a question for part 5, but it affects this part too:
>
> I notice the port defines TRULY_NOOP_TRUNCATION to false for DI->SI,
> like MIPS does.  Are you sure that's the right trade-off?  It had to
> be defined that way for MIPS because 32-bit MIPS instructions gave
> undefined results if the input operands weren't in sign-extended form.
> For example a 32-bit add with 64-bit register contents:
>
>     0x(ffffffff_)fffffffe
>   + 0x(00000000_)00000001
>
> was guaranteed to give -1 (sign-extended) but:
>
>     0x(00000000_)fffffffe
>   + 0x(00000000_)00000001
>
> gave an undefined result.
>
> Does Loongson have the same restriction, or do the 32-bit instructions
> simply ignore the upper 32 bits?  If Loongson ignores the upper bits
> then it would probably be better not to define TRULY_NOOP_TRUNCATION.
>
> If you do that, the subreg pattern above shouldn't be needed; the
> subreg should get folded away by target-independent code.
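
The second possibility in the question above can be modelled in C as follows.  This is only a sketch of the "ignores the upper 32 bits" behaviour; whether the hardware actually behaves this way is exactly what is being asked, and the function name is made up for the example:

```c
#include <stdint.h>

/* Model of a 32-bit add on 64-bit registers, ASSUMING the instruction
   reads only the low 32 bits of each source and writes a sign-extended
   32-bit result.  */
static int64_t
add_w_ignoring_upper_bits (uint64_t a, uint64_t b)
{
  uint32_t sum = (uint32_t) a + (uint32_t) b;   /* wraps modulo 2^32 */
  /* Sign-extend without relying on implementation-defined conversions.  */
  return sum >= 0x80000000u ? (int64_t) sum - 0x100000000LL : (int64_t) sum;
}
```

Under that assumption both register contents from the example give -1, regardless of whether the first input was in sign-extended form.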
>
>> +;;
>> +;;  ....................
>> +;;
>> +;;	MULTIPLICATION
>> +;;
>> +;;  ....................
>> +;;
>> +
>> +(define_insn "mul<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		   (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
> Not a big deal, just noticed that some ANYF patterns (like this one)
> add an explicit TARGET_HARD_FLOAT on top of the ANYF conditions:
>
> (define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
> 			    (DF "TARGET_DOUBLE_FLOAT")])
>
> while other patterns don't.  Both are correct of course, but it might
> be good to use one style throughout the file.  At first, when I saw
> the pattern above, I thought the TARGET_HARD_FLOAT was missing from
> the earlier patterns.
>
>> […]
>> +(define_insn "*div<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		  (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fdiv.<fmt>\t%0,%1,%2"
>> +  [(set_attr "type" "fdiv")
>> +   (set_attr "mode" "<UNITMODE>")
>> +   (set_attr "insn_count" "1")])
>> +
>> +;; In 3A5000, the reciprocal operation is the same as the division operation.
>> +
>> +(define_insn "*recip<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
>> +		  (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "frecip.<fmt>\t%0,%2"
>> +  [(set_attr "type" "frdiv")
>> +   (set_attr "mode" "<UNITMODE>")
>> +   (set_attr "insn_count" "1")])
> Very minor, but these insn_counts seem redundant.  (MIPS needed them
> due to the SB1 errata workaround.)
>
>> […]
>> +;; Floating point multiply accumulate instructions.
>> +
>> +;; a * b + c
>> +(define_expand "fma<mode>4"
>> +  [(set (match_operand:ANYF 0 "register_operand")
>> +	(fma:ANYF (match_operand:ANYF 1 "register_operand")
>> +		  (match_operand:ANYF 2 "register_operand")
>> +		  (match_operand:ANYF 3 "register_operand")))]
>> +  "TARGET_HARD_FLOAT")
>> +
>> +(define_insn "*fma<mode>4_madd4"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		  (match_operand:ANYF 2 "register_operand" "f")
>> +		  (match_operand:ANYF 3 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fmadd.<fmt>\t%0,%1,%2,%3"
>> +  [(set_attr "type" "fmadd")
>> +   (set_attr "mode" "<UNITMODE>")])
> This pair of patterns could be a single define_insn, like for fms.
>
>> […]
>> +;; Integer truncation patterns.  Truncating SImode values to smaller
>> +;; modes is a no-op, as it is for most other GCC ports.  Truncating
>> +;; DImode values to SImode is not a no-op for TARGET_64BIT since we
>> +;; need to make sure that the lower 32 bits are properly sign-extended
>> +;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
>> +;; smaller than SImode is equivalent to two separate truncations:
>> +;;
>> +;;			  A       B
>> +;;    DI ---> HI  ==  DI ---> SI ---> HI
>> +;;    DI ---> QI  ==  DI ---> SI ---> QI
>> +;;
>> +;; Step A needs a real instruction but step B does not.
>> +
>> +(define_insn "truncdi<mode>2"
>> +  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
>> +	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +    slli.w\t%0,%1,0
>> +    st.<size>\t%1,%0
>> +    stx.<size>\t%1,%0"
>> +  [(set_attr "move_type" "sll0,store,store")
>> +   (set_attr "mode" "SI")])
> Following on from the above, this pattern wouldn't be needed
> if TRULY_NOOP_TRUNCATION can be left undefined.
>
>> […]
>> +;; Combiner patterns to optimize shift/truncate combinations.
>> +
>> +(define_insn "*ashr_trunc<mode>"
>> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
>> +	(truncate:SUBDI
>> +	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
>> +		       (match_operand:DI 2 "const_arith_operand" ""))))]
>> +  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
>> +  "srai.d\t%0,%1,%2"
>> +  [(set_attr "type" "shift")
>> +   (set_attr "mode" "<MODE>")])
>> +
>> +(define_insn "*lshr32_trunc<mode>"
>> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
>> +	(truncate:SUBDI
>> +	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
>> +		       (const_int 32))))]
>> +  "TARGET_64BIT"
>> +  "srai.d\t%0,%1,32"
>> +  [(set_attr "type" "shift")
>> +   (set_attr "mode" "<MODE>")])
> Same for these.
>
>> +
>> +;;
>> +;;  ....................
>> +;;
>> +;;	ZERO EXTENSION
>> +;;
>> +;;  ....................
>> +
>> +(define_insn "zero_extendsidi2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
>> +	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +   bstrpick.d\t%0,%1,31,0
>> +   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0
> FWIW, splitting this after reload would give more scheduling freedom,
> but it's not necessary to do that for the submission.
>
>> +   ld.wu\t%0,%1
>> +   ldx.wu\t%0,%1"
>> +  [(set_attr "move_type" "arith,load,load,load")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "1,2,1,1")])
>> +
>> […]
>> +;; Combiner patterns to optimize truncate/zero_extend combinations.
>> +
>> +(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=r")
>> +	(zero_extend:GPR
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
>> +  [(set_attr "move_type" "pick_ins")
>> +   (set_attr "mode" "<GPR:MODE>")])
>> +
>> +(define_insn "*zero_extendhi_truncqi"
>> +  [(set (match_operand:HI 0 "register_operand" "=r")
>> +	(zero_extend:HI
>> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "andi\t%0,%1,0xff"
>> +  [(set_attr "alu_type" "and")
>> +   (set_attr "mode" "HI")])
>> +
>> +;;
>> +;;  ....................
>> +;;
>> +;;	SIGN EXTENSION
>> +;;
>> +;;  ....................
>> +
>> +;; Extension insns.
>> +;; Those for integer source operand are ordered widest source type first.
>> +
>> +;; When TARGET_64BIT, all SImode integer should already be in sign-extended
>> +;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
>> +;; get rid of register->register instructions if we constrain the source to
>> +;; be in the same register as the destination.
>> +;;
>> +;; Only the pre-reload scheduler sees the type of the register alternatives;
>> +;; we split them into nothing before the post-reload scheduler runs.
>> +;; These alternatives therefore have type "move" in order to reflect
>> +;; what happens if the two pre-reload operands cannot be tied, and are
>> +;; instead allocated two separate GPRs.
>> +(define_insn_and_split "extendsidi2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
>> +	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +   #
>> +   ldptr.w\t%0,%1
>> +   ld.w\t%0,%1
>> +   ldx.w\t%0,%1"
>> +  "&& reload_completed && register_operand (operands[1], VOIDmode)"
>> +  [(const_int 0)]
>> +{
>> +  emit_note (NOTE_INSN_DELETED);
>> +  DONE;
>> +}
>> +  [(set_attr "move_type" "move,load,load,load")
>> +   (set_attr "mode" "DI")])
> The three patterns above would also look different without
> the definition of TRULY_NOOP_TRUNCATION.
>
>> […]
>> +(define_insn "*extenddi_truncate<mode>"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(sign_extend:DI
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.<size>\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "DI")])
>> +
>> +(define_insn "*extendsi_truncate<mode>"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +	(sign_extend:SI
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.<size>\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "SI")])
>> +
>> +(define_insn "*extendhi_truncateqi"
>> +  [(set (match_operand:HI 0 "register_operand" "=r")
>> +	(sign_extend:HI
>> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.b\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "SI")])
> Same for these.
>
>> +
>> +(define_insn "extendsfdf2"
>> +  [(set (match_operand:DF 0 "register_operand" "=f")
>> +	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
>> +  "TARGET_DOUBLE_FLOAT"
>> +  "fcvt.d.s\t%0,%1"
>> +  [(set_attr "type" "fcvt")
>> +   (set_attr "cnv_mode"	"S2D")
>> +   (set_attr "mode" "DF")])
> Very minor, but it's probably better to use a space rather than a tab
> after "cnv_mode", for consistency with the other attributes.  Same in
> the rest of the file.
>
>> +;; floating point value by converting to value to an unsigned integer
> typo, maybe:
>
> ;; Convert a floating-point value to an unsigned integer.
>
>> +
>> +(define_expand "fixuns_truncdfsi2"
>> +  [(set (match_operand:SI 0 "register_operand")
>> +	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
>> +  "TARGET_DOUBLE_FLOAT"
>> +{
>> +  rtx reg1 = gen_reg_rtx (DFmode);
>> +  rtx reg2 = gen_reg_rtx (DFmode);
>> +  rtx reg3 = gen_reg_rtx (SImode);
>> +  rtx_code_label *label1 = gen_label_rtx ();
>> +  rtx_code_label *label2 = gen_label_rtx ();
>> +  rtx test;
>> +  REAL_VALUE_TYPE offset;
>> +
>> +  real_2expN (&offset, 31, DFmode);
>> +
>> +  if (reg1)		      /* Turn off complaints about unreached code.  */
>> +    {
> I think we can drop this workaround.  IIRC it was needed for the
> MIPS IRIX compilers, or maybe it was some ancient version of GCC.
>
>> +      loongarch_emit_move (reg1,
>> +			   const_double_from_real_value (offset, DFmode));
>> +      do_pending_stack_adjust ();
>> +
>> +      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
>> +      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
>> +
>> +      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
>> +      emit_jump_insn (gen_rtx_SET (pc_rtx,
>> +				   gen_rtx_LABEL_REF (VOIDmode, label2)));
>> +      emit_barrier ();
>> +
>> +      emit_label (label1);
>> +      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
>> +      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
>> +				     (BITMASK_HIGH, SImode)));
>> +
>> +      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
>> +      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
>> +
>> +      emit_label (label2);
>> +
>> +      /* Allow REG_NOTES to be set on last insn (labels don't have enough
>> +	 fields, and can't be used for REG_NOTES anyway).  */
>> +      emit_use (stack_pointer_rtx);
>> +      DONE;
>> +    }
>> +})
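
For readers following the expander above, the strategy it implements can be sketched in plain C.  This assumes inputs in the range [0, 2^32) and that BITMASK_HIGH is the 0x80000000 bit; the function name is invented for the example:

```c
#include <stdint.h>

/* Unsigned conversion built from a signed one: below 2^31 convert
   directly, otherwise subtract 2^31 first and OR the top bit back in.  */
static uint32_t
fixuns_trunc_df_si (double x)
{
  const double two31 = 2147483648.0;          /* the 2^31 offset */

  if (x >= two31)
    {
      int32_t low = (int32_t) (x - two31);    /* signed conversion of x - 2^31 */
      return (uint32_t) low | 0x80000000u;    /* restore the high bit */
    }
  return (uint32_t) (int32_t) x;              /* signed conversion suffices */
}
```

The two branches correspond to the code after label1 and the fall-through path in the expander respectively.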
>> […]
>> +;; Allow combine to split complex const_int load sequences, using operand 2
>> +;; to store the intermediate results.  See move_operand for details.
>> +(define_split
>> +  [(set (match_operand:GPR 0 "register_operand")
>> +	(match_operand:GPR 1 "splittable_const_int_operand"))
>> +   (clobber (match_operand:GPR 2 "register_operand"))]
>> +  ""
>> +  [(const_int 0)]
>> +{
>> +  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
>> +  DONE;
>> +})
> Does this define_split trigger?  I couldn't see an associated
> define_insn that would provide the clobber.
>
>> +(define_insn "*movsi_internal"
>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
>> +	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
>> +  "(register_operand (operands[0], SImode)
>> +       || reg_or_0_operand (operands[1], SImode))"
> Minor formatting nit: too much indentation on the line above.  Same for
> the other moves.
>
>> […]
>> +;; Conditional move instructions.
>> +
>> +(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=r,r")
>> +	(if_then_else:GPR
>> +	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
>> +			   (const_int 0))
>> +	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
>> +	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
>> +  "register_operand (operands[2], <GPR:MODE>mode)
>> +       != register_operand (operands[3], <GPR:MODE>mode)"
> Same here.
>
>> +  "@
>> +   <sel>\t%0,%2,%1
>> +   <selinv>\t%0,%3,%1"
>> +  [(set_attr "type" "condmove")
>> +   (set_attr "mode" "<GPR:MODE>")])
>> +
>> +;; sel.fmt copies the 3rd argument when the 1st is non-zero and the 2nd
> s/sel.fmt/fsel/
>
>> +;; argument if the 1st is zero.  This means operand 2 and 3 are
>> +;; inverted in the instruction.
>> +
>> +(define_insn "*sel<mode>"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(if_then_else:ANYF
>> +	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
>> +		 (const_int 0))
>> +	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
>> +	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fsel\t%0,%3,%2,%1"
>> +  [(set_attr "type" "condmove")
>> +   (set_attr "mode" "<ANYF:MODE>")])
>> […]
>> +(define_insn "lu52i_d"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(ior:DI
>> +	  (and:DI (match_operand:DI 1 "register_operand" "r")
>> +		  (match_operand 2 "lu52i_mask_operand"))
>> +	  (match_operand 3 "const_lu52i_operand" "v")))]
>> +    "TARGET_64BIT"
>> +    "lu52i.d\t%0,%1,%X3>>52"
>> +    [(set_attr "type" "arith")
>> +     (set_attr "mode" "DI")])
> Formatting nit: too much indentation from "TARGET_64BIT" onwards.
>
>> +
>> +;; Convert floating-point numbers to integers
>> +(define_insn "frint_<fmt>"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
>> +		      UNSPEC_FRINT))]
>> +  "TARGET_HARD_FLOAT"
>> +  "frint.<fmt>\t%0,%1"
>> +  [(set_attr "type" "fcvt")
>> +   (set_attr "mode" "<MODE>")])
>> +
>> +;; LoongArch supports loading and storing a floating point register from
>> +;; the sum of two general-purpose registers.  We use two versions for each of
>> +;; these four instructions: one where the two general-purpose registers are
>> +;; SImode, and one where they are DImode.  This is because general-purpose
>> +;; registers will be in SImode when they hold 32-bit values, but,
>> +;; since the 32-bit values are always sign extended, the f{ld/st}x.{s/d}
>> +;; instructions will still work correctly.
>> +
>> +;; ??? Perhaps it would be better to support these instructions by
>> +;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
>> +;; these instructions can only be used to load and store floating
>> +;; point registers, that would probably cause trouble in reload.
> Does this comment apply to Loongson?  It looks from:
>
>    +;; Similarly for LoongArch indexed GPR loads and stores.
>    +(define_mode_attr loadx [(QI "ldx.b")
>    +			 (HI "ldx.h")
>    +			 (SI "ldx.w")
>    +			 (DI "ldx.d")])
>    +(define_mode_attr storex [(QI "stx.b")
>    +			  (HI "stx.h")
>    +			  (SI "stx.w")
>    +			  (DI "stx.d")])
>
> like Loongson has a full set of indexed loads and stores, so supporting
> indexed addresses in TARGET_LEGITIMATE_ADDRESS_P should work.
>
> Also, the MIPS comment predates LRA, which is better than old reload
> at handling irregular memory address requirements.  Accepting indexed
> addresses in TARGET_LEGITIMATE_ADDRESS_P would allow ivopts to optimise
> the code better.
>
>> […]
>> +;; Expand in-line code to clear the instruction cache between operand[0] and
>> +;; operand[1].
>> +(define_expand "clear_cache"
>> +  [(match_operand 0 "pmode_register_operand")
>> +   (match_operand 1 "pmode_register_operand")]
>> +  ""
>> +  "
>> +{
>> +  emit_insn (gen_ibar (const0_rtx));
>> +  DONE;
>> +}")
> Minor nit: the quotes before { and after } are unnecessary.
>
>> […]
>> +(define_insn "asrtle_d"
>> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
>> +			      (match_operand:DI 1 "register_operand" "r")]
>> +			      UNSPECV_ASRTLE_D)]
>> +  "TARGET_64BIT"
>> +  "asrtle.d\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "mode" "DI")])
>> +
>> +(define_insn "asrtgt_d"
>> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
>> +			      (match_operand:DI 1 "register_operand" "r")]
>> +			      UNSPECV_ASRTGT_D)]
>> +  "TARGET_64BIT"
>> +  "asrtgt.d\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "mode" "DI")])
> Formatting: the unspec_volatile pattern is indented too far in both cases.
>
>> […]
>> +;; The following templates were added to generate "bstrpick.d + alsl.d"
>> +;; instruction pairs.
>> +;; It is required that the values of const_immalsl_operand and
>> +;; immediate_operand must have the following correspondence:
>> +;;
>> +;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
>> +
>> +(define_insn "zero_extend_ashift1"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +			   (match_operand 2 "const_immalsl_operand" ""))
>> +		(match_operand 3 "immediate_operand" "")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> Without the TRULY_NOOP_TRUNCATION definition, this pattern would be
> redundant with…
>
>> +(define_insn "zero_extend_ashift2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
>> +			   (match_operand 2 "const_immalsl_operand" ""))
>> +		(match_operand 3 "immediate_operand" "")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> …this one.  I'm surprised it isn't already TBH.  Doesn't the subreg match:
>
>    (match_operand:DI 1 "register_operand" "r")
>
> ?
>
>> +
>> +(define_insn "alsl_paired1"
>> +  [(set (match_operand:DI 0 "register_operand" "=&r")
>> +	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +				    (match_operand 2 "const_immalsl_operand" ""))
>> +			 (match_operand 3 "immediate_operand" ""))
>> +		 (match_operand:DI 4 "register_operand" "r")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
>> +  [(set_attr "type" "arith")
>> +  (set_attr "mode" "DI")
>> +  (set_attr "insn_count" "2")])
>> +
>> +(define_insn "alsl_paired2"
>> +  [(set (match_operand:DI 0 "register_operand" "=&r")
>> +	(plus:DI (match_operand:DI 1 "register_operand" "r")
>> +		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
>> +				    (match_operand 3 "const_immalsl_operand" ""))
>> +			 (match_operand 4 "immediate_operand" ""))))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> Same for this pair.
>
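The mask/shift condition shared by these patterns can be checked numerically. The following is a minimal sketch, not part of the patch: it assumes the shift accepted by const_immalsl_operand lies in the range 1 to 4 (the predicate is not quoted here), and the helper names paired_mask, rtl_form and insn_pair_form are hypothetical. It shows that a mask satisfying (mask >> shift) == 0xffffffff makes the RTL expression equal to what "bstrpick.d + alsl.d" computes.

```c
#include <assert.h>
#include <stdint.h>

/* Mask that pairs with a left shift by SHIFT so that
   (mask >> shift) == 0xffffffff, as the insn condition requires.  */
static uint64_t
paired_mask (unsigned shift)
{
  assert (shift >= 1 && shift <= 4);  /* Assumed alsl.d shift range.  */
  return (uint64_t) 0xffffffff << shift;
}

/* What the RTL expresses: (and (ashift x shift) mask).  */
static uint64_t
rtl_form (uint64_t x, unsigned shift)
{
  return (x << shift) & paired_mask (shift);
}

/* What the emitted pair computes: bstrpick.d keeps bits 31..0,
   then alsl.d shifts that left and adds $r0 (zero).  */
static uint64_t
insn_pair_form (uint64_t x, unsigned shift)
{
  uint64_t low32 = x & 0xffffffff;  /* bstrpick.d %0,%1,31,0 */
  return (low32 << shift) + 0;      /* alsl.d %0,%0,$r0,shift */
}
```

Calling rtl_form and insn_pair_form with the same arguments should give the same value for any 64-bit input, which is why the patterns can emit the two-instruction sequence whenever the condition holds.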
>> […]
>> +(define_expand "tablejump"
>> +  [(set (pc)
>> +	(match_operand 0 "register_operand"))
>> +   (use (label_ref (match_operand 1 "")))]
>> +  ""
>> +{
>> +  if (flag_pic)
>> +      operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
>> +					 gen_rtx_LABEL_REF (Pmode,
>> +							    operands[1]),
>> +					 NULL_RTX, 0, OPTAB_DIRECT);
> Formatting nit: the last four lines should be indented by two spaces fewer.
>
>> +  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
>> +  DONE;
>> +})
>> +(define_insn "sibcall_internal"
>> +  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
>> +	 (match_operand 1 "" ""))]
>> +  "SIBLING_CALL_P (insn)"
>> +{
>> +  switch (which_alternative)
>> +    {
>> +    case 0:
>> +      return "jr\t%0";
>> +    case 1:
>> +      if (TARGET_CMODEL_LARGE)
>> +	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
>> +	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
>> +      else if (TARGET_CMODEL_EXTREME)
>> +	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "b\t%0";
>> +    case 2:
>> +      if (TARGET_CMODEL_TINY_STATIC)
>> +	return "b\t%0";
>> +      else if (TARGET_CMODEL_EXTREME)
>> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "la.global\t$r12,%0\n\tjr\t$r12";
>> +    case 3:
>> +      if (TARGET_CMODEL_EXTREME)
>> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "la.global\t$r12,%0\n\tjr\t$r12";
>> +    case 4:
>> +      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
>> +	return "b\t%%plt(%0)";
>> +      else if (TARGET_CMODEL_LARGE)
>> +	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
>> +	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
>> +      else
>> +	{
>> +	  sorry ("cmodel extreme and tiny static not support plt");
> “do not support PLTs”.
>
> Can this be triggered by a certain combination of source code
> and command-line options, or is it really an internal error?
>
>> +	  return "";  /* GCC complains about may fall through.  */
> sorry() isn't a fatal error, so the compiler will continue.
> Returning "" is the right thing to do, but the comment makes it
> sound unreachable.
>
> Same comments for later sorry() + return pairs.
>
>> +	}
>> +    default:
>> +      gcc_unreachable ();
>> +    }
>> +}
>> +  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
>> +
>> +(define_expand "sibcall_value"
>> +  [(parallel [(set (match_operand 0 "")
>> +		   (call (match_operand 1 "")
>> +			 (match_operand 2 "")))
>> +	      (use (match_operand 3 ""))])]		;; next_arg_reg
>> +  ""
>> +{
>> +  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
>> +
>> + /*  Handle return values created by loongarch_pass_fpr_pair.  */
> Formatting nit: should be two spaces before “/*” and only one after it.
>
>> +  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
>> +    {
>> +      rtx arg1 = XEXP (XVECEXP (operands[0],0, 0), 0);
>> +      rtx arg2 = XEXP (XVECEXP (operands[0],0, 1), 0);
>> +
>> +      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
>> +							   operands[2],
>> +							   arg2));
>> +    }
>> +   else
>> +    {
>> +      /*  Handle return values created by loongarch_return_fpr_single.  */
> Should only be one space after “/*” here too.
>
>> +      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
>> +      operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
> The last line should be indented by two spaces more.
>
> Same three comments for call_value.
>
>> […]
>> +;;
>> +;;  ....................
>> +;;
>> +;;	MISC.
>> +;;
>> +;;  ....................
>> +;;
>> +
>> +(define_insn "nop"
>> +  [(const_int 0)]
>> +  ""
>> +  "nop"
>> +  [(set_attr "type"	"nop")
>> +   (set_attr "mode"	"none")])
> Formatting nit: most of the rest of the file uses a space rather than a
> tab after the attribute name.  IMO a space looks nicer.  Same for some
> later patterns.
>
>> +
>> +;; __builtin_loongarch_movfcsr2gr: move the FCSR into operand 0.
>> +(define_insn "loongarch_movfcsr2gr"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
>> +    UNSPECV_MOVFCSR2GR))]
> Last two lines look under-indented.
>
>> +  "TARGET_HARD_FLOAT"
>> +  "movfcsr2gr\t%0,$r%1")
>> +
>> […]
>> +(define_insn "stack_tie<mode>"
>> +  [(set (mem:BLK (scratch))
>> +	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
>> +		     (match_operand:GPR 1 "register_operand" "r")]
>> +		    UNSPEC_TIE))]
>> +  ""
>> +  ""
>> +  [(set_attr "length" "0")]
>> +)
> Using [(set_attr "type" "ghost")] should be slightly better for
> scheduling than setting the length directly.
>
>> +
>> +(define_insn "gpr_restore_return"
>> +  [(return)
>> +   (use (match_operand 0 "pmode_register_operand" ""))
>> +   (const_int 0)]
>> +  ""
>> +  "")
> Might be worth adding a comment here.  Why does this form of return expand
> to no code?
>
>> +
>> +(define_split
>> +  [(match_operand 0 "small_data_pattern")]
>> +  "reload_completed"
>> +  [(match_dup 0)]
>> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
>> +
>> +
>> +;; Match paired HI/SI/SF/DFmode load/stores.
>> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
>> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
>> +  "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
>> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
>> +   "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
>> +  "reload_completed"
>> +  {
>> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
>> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
>> +       Hardware does not bond those loads, even when they are consecutive.
>> +       However, order of the loads need to be checked for correctness.  */
>> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
>> +      {
> I'm not sure I understand how these patterns work, but it looks like the
> condition above is trying to work around a later change to the insn by
> regrename, after peephole2 has checked loongarch_load_store_bonding_p.
> If so, you should be able to avoid that by marking the destinations of
> the loads as earlyclobbers, using "&r" instead of "r" for the first
> alternative.  regrename should then preserve the conditions that
> loongarch_load_store_bonding_p checked earlier.
>
> Same for the other patterns.
>
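To illustrate why the order of the two loads matters when the first destination overlaps the base register, here is a toy model. It is only a sketch under simplifying assumptions: struct machine, load_word and paired_load are hypothetical names, memory is addressed in words, and nothing here is GCC code. It mirrors the ordering check in the output template; the earlyclobber suggestion above would instead keep such an overlap from arising in the first place.

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 4, NWORDS = 8 };

/* A tiny register file plus word-addressed memory.  */
struct machine
{
  uint32_t reg[NREGS];
  uint32_t mem[NWORDS];
};

/* reg[dest] = mem[reg[base] + index].  */
static void
load_word (struct machine *m, int dest, int base, int index)
{
  uint32_t addr = m->reg[base] + (uint32_t) index;
  assert (addr < NWORDS);
  m->reg[dest] = m->mem[addr];
}

/* Load two adjacent words through the same base register.  If the
   first destination is the base register itself, do the second load
   first so that the base is still intact when it is needed.  */
static void
paired_load (struct machine *m, int dest0, int dest1, int base)
{
  if (dest0 != base)
    {
      load_word (m, dest0, base, 0);
      load_word (m, dest1, base, 1);
    }
  else
    {
      load_word (m, dest1, base, 1);
      load_word (m, dest0, base, 0);
    }
}
```

With reg[0] holding the base address, paired_load (&m, 0, 1, 0) still reads both words from the original address, because the load that overwrites the base is issued last.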
>> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
>> +			 operands);
>> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
>> +			 &operands[2]);
>> +      }
>> +    else
>> +      {
>> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
>> +			 &operands[2]);
>> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
>> +			 operands);
>> +      }
>> +    return "";
>> +  }
>> +  [(set_attr "move_type"
>> +  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
>> +   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
>> […]
>> +;; This is used for indexing into vectors, and hence only accepts const_int.
>> +(define_predicate "const_0_or_1_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> It doesn't look like this is used.  It'd be good to check the other
> predicates to see if any of them can be removed.  In particular…
>
>> […]
>> +(define_predicate "const_8_to_15_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
>> +
>> +(define_predicate "const_16_to_31_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
> …these two don't seem to do what their name suggests.
>
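A small check makes the mismatch concrete. This is a sketch only: in_range is a local stand-in for GCC's IN_RANGE (inclusive at both ends), and the 8..15 and 16..31 bounds are an assumption about what the predicate names intend, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for GCC's IN_RANGE; inclusive at both ends.  */
static bool
in_range (long value, long lower, long upper)
{
  assert (lower <= upper);
  return value >= lower && value <= upper;
}

/* The range both predicates test as posted.  */
static bool
as_posted (long value)
{
  return in_range (value, 0, 7);
}

/* The ranges the predicate names suggest (assumed intent).  */
static bool
const_8_to_15 (long value)
{
  return in_range (value, 8, 15);
}

static bool
const_16_to_31 (long value)
{
  return in_range (value, 16, 31);
}
```

As posted, a value such as 8 or 16 is rejected by both predicates, while the name-based versions accept it.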
>> […]
>> +(define_predicate "muldiv_target_operand"
>> +		(match_operand 0 "register_operand"))
> This one also seems unused.  IMO using register_operand would be clearer.
>
>> […]
>> +(define_predicate "is_const_call_local_symbol"
>> +  (and (match_operand 0 "const_call_insn_operand")
>> +       (ior (match_test "loongarch_global_symbol_p (op) == 0")
>> +       (match_test "loongarch_symbol_binds_local_p (op) != 0"))
> The indentation looks misleading here: the last line is another
> operand of the ior.
>
> Thanks,
> Richard
>
>> +       (match_test "CONSTANT_P (op)")))
>> […]
Xi Ruoyao March 7, 2022, 8:15 p.m. UTC | #3
On Fri, 2022-03-04 at 15:18 +0800, xuchenghua@loongson.cn wrote:

>         * config/loongarch/loongarch.md: New file.

An ICE happens building OpenSSH-8.9p1.  Investigation shows it's caused
by the flag "-fzero-call-used-regs=".  It's because the compiler
attempts to clear FCCx registers but can't figure out how.

This flag also triggers ICE for other targets (for example, PR 104820
for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are
marked dg-skip for many targets.

But it's unfortunate that packages like OpenSSH have already started to
use this flag... I guess they just enabled it once they saw it was
working for i386 :(.  So it's better to solve the problem for a new
target.

A "quick fix" is adding an insn to clear FCCx.  This is enough to build
OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS.

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index a9a8bc4b038..76c5ded9fe4 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2020,6 +2020,12 @@
   DONE;
 })
 
+;; Clear one FCC register
+(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z")
+                     (const_int 0))]
+  ""
+  "movgr2cf\t%0,$r0")
+
 ;; Conditional move instructions.
 
 (define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
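For context, the flag applies to ordinary code, so even a simple floating-point comparison can be affected. A minimal sketch, assuming a hard-float LoongArch target on which a double comparison is evaluated through an FCC register; count_below is a hypothetical function, and "all" is just one of the values the option accepts:

```c
#include <assert.h>
#include <stddef.h>

/* Count the elements of VALUES that compare below LIMIT.  The
   comparison is the part expected to involve a condition-code
   register on a hard-float target (an assumption, see above).  */
int
count_below (const double *values, int n, double limit)
{
  assert (n == 0 || values != NULL);
  int count = 0;
  for (int i = 0; i < n; i++)
    if (values[i] < limit)
      count++;
  return count;
}
```

Building such a file with something like "gcc -O2 -fzero-call-used-regs=all -c" asks the compiler to zero call-used registers before the function returns, which is the point where it needs a way to clear FCCx.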
Lulu Cheng March 10, 2022, 6:26 a.m. UTC | #4
Hi,

We are modifying the code; this support will be added in the next commit.

Thanks.

On 2022/3/8 at 4:15 AM, Xi Ruoyao wrote:
> On Fri, 2022-03-04 at 15:18 +0800, xuchenghua@loongson.cn wrote:
>
>>          * config/loongarch/loongarch.md: New file.
> An ICE happens building OpenSSH-8.9p1.  Investigation shows it's caused
> by the flag "-fzero-call-used-regs=".  It's because the compiler
> attempts to clear FCCx registers but can't figure out how.
>
> This flag also triggers ICE for other targets (for example, PR 104820
> for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are
> marked dg-skip for many targets.
>
> But it's unfortunate that packages like OpenSSH have already started to
> use this flag... I guess they just enabled it once they saw it was
> working for i386 :(.  So it's better to solve the problem for a new
> target.
>
> A "quick fix" is adding an insn to clear FCCx.  This is enough to build
> OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS.
>
> diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
> index a9a8bc4b038..76c5ded9fe4 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -2020,6 +2020,12 @@
>     DONE;
>   })
>   
> +;; Clear one FCC register
> +(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z")
> +                     (const_int 0))]
> +  ""
> +  "movgr2cf\t%0,$r0")
> +
>   ;; Conditional move instructions.
>   
>   (define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
Lulu Cheng March 19, 2022, 9:40 a.m. UTC | #5
On 2022/3/7 at 12:16 AM, Richard Sandiford wrote:
>> +(define_split
>> +  [(match_operand 0 "small_data_pattern")]
>> +  "reload_completed"
>> +  [(match_dup 0)]
>> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
>> +
>> +
>> +;; Match paired HI/SI/SF/DFmode load/stores.
>> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
>> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
>> +  "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
>> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
>> +   "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
>> +  "reload_completed"
>> +  {
>> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
>> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
>> +       Hardware does not bond those loads, even when they are consecutive.
>> +       However, order of the loads need to be checked for correctness.  */
>> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
>> +      {
> I'm not sure I understand how these patterns work, but it looks like the
> condition above is trying to work around a later change to the insn by
> regrename, after peephole2 has checked loongarch_load_store_bonding_p.
> If so, you should be able to avoid that by marking the destinations of
> the loads as earlyclobbers, using "&r" instead of "r" for the first
> alternative.  regrename should then preserve the conditions that
> loongarch_load_store_bonding_p checked earlier.
>
> Same for the other patterns.
>
Hi,

I think the peephole pass runs after the reload pass, so the peephole pass doesn't need '&'.


Thanks.

Patch

diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
new file mode 100644
index 00000000000..e3e3f79224b
--- /dev/null
+++ b/gcc/config/loongarch/constraints.md
@@ -0,0 +1,204 @@ 
+;; Constraint definitions for LoongArch.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Register constraints
+
+;; "a" "A constant call global and noplt address."
+;; "b" <-----unused
+;; "c" "A constant call local address."
+;; "d" <-----unused
+;; "e" JIRL_REGS
+;; "f" FP_REGS
+;; "g" <-----unused
+;; "h" "A constant call plt address."
+;; "i" "Matches a general integer constant." (Global non-architectural)
+;; "j" SIBCALL_REGS
+;; "k" "A memory operand whose address is formed by a base register and
+;;      (optionally scaled) index register."
+;; "l" "A signed 16-bit constant."
+;; "m" "A memory operand whose address is formed by a base register and offset
+;;      that is suitable for use in instructions with the same addressing mode
+;;      as @code{st.w} and @code{ld.w}."
+;; "n" "Matches a non-symbolic integer constant." (Global non-architectural)
+;; "o" "Matches an offsettable memory reference." (Global non-architectural)
+;; "p" "Matches a general address." (Global non-architectural)
+;; "q" CSR_REGS
+;; "r" GENERAL_REGS (Global non-architectural)
+;; "s" "Matches a symbolic integer constant." (Global non-architectural)
+;; "t" "A constant call weak address"
+;; "u" "A signed 52bit constant and low 32-bit is zero (for logic instructions)"
+;; "v" "A signed 64-bit constant and low 44-bit is zero (for logic instructions)."
+;; "w" "Matches any valid memory."
+;; "x" <-----unused
+;; "y" <-----unused
+;; "z" FCC_REGS
+;; "A" <-----unused
+;; "B" <-----unused
+;; "C" <-----unused
+;; "D" <-----unused
+;; "E" "Matches a floating-point constant." (Global non-architectural)
+;; "F" "Matches a floating-point constant." (Global non-architectural)
+;; "G" "Floating-point zero."
+;; "H" <-----unused
+;; "I" "A signed 12-bit constant (for arithmetic instructions)."
+;; "J" "Integer zero."
+;; "K" "An unsigned 12-bit constant (for logic instructions)."
+;; "L" <-----unused
+;; "M" <-----unused
+;; "N" <-----unused
+;; "O" <-----unused
+;; "P" <-----unused
+;; "Q" <-----unused
+;; "R" <-----unused
+;; "S" <-----unused
+;; "T" <-----unused
+;; "U" <-----unused
+;; "V" "Matches a non-offsettable memory reference." (Global non-architectural)
+;; "W" <-----unused
+;; "X" "Matches anything." (Global non-architectural)
+;; "Y" -
+;;    "Yd"
+;;       "A constant @code{move_operand} that can be safely loaded using
+;;	  @code{la}."
+;;    "Yx"
+;; "Z" -
+;;    "ZC"
+;;      "A memory operand whose address is formed by a base register and offset
+;;       that is suitable for use in instructions with the same addressing mode
+;;       as @code{ll.w} and @code{sc.w}."
+;;    "ZB"
+;;      "An address that is held in a general-purpose register.
+;;      The offset is zero"
+;; "<" "Matches a pre-dec or post-dec operand." (Global non-architectural)
+;; ">" "Matches a pre-inc or post-inc operand." (Global non-architectural)
+
+(define_constraint "a"
+  "@internal
+   A constant call global and noplt address."
+  (match_operand 0 "is_const_call_global_noplt_symbol"))
+
+(define_constraint "c"
+  "@internal
+   A constant call local address."
+  (match_operand 0 "is_const_call_local_symbol"))
+
+(define_register_constraint "e" "JIRL_REGS"
+  "@internal")
+
+(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
+  "A floating-point register (if available).")
+
+(define_constraint "h"
+  "@internal
+   A constant call plt address."
+  (match_operand 0 "is_const_call_plt_symbol"))
+
+(define_register_constraint "j" "SIBCALL_REGS"
+  "@internal")
+
+(define_memory_constraint "k"
+  "A memory operand whose address is formed by a base register and (optionally scaled)
+   index register."
+  (and (match_code "mem")
+       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
+       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))
+
+(define_constraint "l"
+"A signed 16-bit constant."
+(and (match_code "const_int")
+     (match_test "IMM16_OPERAND (ival)")))
+
+(define_memory_constraint "m"
+  "A memory operand whose address is formed by a base register and offset
+   that is suitable for use in instructions with the same addressing mode
+   as @code{st.w} and @code{ld.w}."
+  (and (match_code "mem")
+       (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)")))
+
+(define_register_constraint "q" "CSR_REGS"
+  "A general-purpose register except for $r0 and $r1 for lcsr.")
+
+(define_constraint "t"
+  "@internal
+   A constant call weak address."
+  (match_operand 0 "is_const_call_weak_symbol"))
+
+(define_constraint "u"
+  "A signed 52bit constant and low 32-bit is zero (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "LU32I_OPERAND (ival)")))
+
+(define_constraint "v"
+  "A signed 64-bit constant and low 44-bit is zero (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "LU52I_OPERAND (ival)")))
+
+(define_register_constraint "z" "FCC_REGS"
+  "A floating-point condition code register.")
+
+;; Floating-point constraints
+
+(define_constraint "G"
+  "Floating-point zero."
+  (and (match_code "const_double")
+       (match_test "op == CONST0_RTX (mode)")))
+
+;; Integer constraints
+
+(define_constraint "I"
+  "A signed 12-bit constant (for arithmetic instructions)."
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (ival)")))
+
+(define_constraint "J"
+  "Integer zero."
+  (and (match_code "const_int")
+       (match_test "ival == 0")))
+
+(define_constraint "K"
+  "An unsigned 12-bit constant (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND_UNSIGNED (ival)")))
+
+(define_constraint "Yd"
+  "@internal
+   A constant @code{move_operand} that can be safely loaded using
+   @code{la}."
+  (and (match_operand 0 "move_operand")
+       (match_test "CONSTANT_P (op)")))
+
+(define_constraint "Yx"
+   "@internal"
+   (match_operand 0 "low_bitmask_operand"))
+
+(define_memory_constraint "ZC"
+  "A memory operand whose address is formed by a base register and offset
+   that is suitable for use in instructions with the same addressing mode
+   as @code{ll.w} and @code{sc.w}."
+  (and (match_code "mem")
+       (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)")))
+
+(define_memory_constraint "ZB"
+  "@internal
+  An address that is held in a general-purpose register.
+  The offset is zero"
+  (and (match_code "mem")
+       (match_test "GET_CODE (XEXP (op,0)) == REG")))
+
diff --git a/gcc/config/loongarch/generic.md b/gcc/config/loongarch/generic.md
new file mode 100644
index 00000000000..a3a01f674f3
--- /dev/null
+++ b/gcc/config/loongarch/generic.md
@@ -0,0 +1,132 @@ 
+;; Generic DFA-based pipeline description for LoongArch targets
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; Pipeline descriptions.
+;;
+;; generic.md provides a fallback for processors without a specific
+;; pipeline description.  It is derived from the old define_function_unit
+;; version and uses the "alu" and "imuldiv" units declared below.
+;;
+;; Some of the processor-specific files are also derived from old
+;; define_function_unit descriptions and simply override the parts of
+;; generic.md that don't apply.  The other processor-specific files
+;; are self-contained.
+(define_automaton "alu,imuldiv")
+
+(define_cpu_unit "alu" "alu")
+(define_cpu_unit "imuldiv" "imuldiv")
+
+;; Ghost instructions produce no real code.
+;; They exist purely to express an effect on dataflow.
+(define_insn_reservation "ghost" 0
+  (eq_attr "type" "ghost")
+  "nothing")
+
+;; This file is derived from the old define_function_unit description.
+;; Each reservation can be overridden on a processor-by-processor basis.
+
+(define_insn_reservation "generic_alu" 1
+  (eq_attr "type" "unknown,prefetch,prefetchx,condmove,const,arith,
+		   shift,slt,clz,trap,multi,nop,logical,signext,move")
+  "alu")
+
+(define_insn_reservation "generic_load" 3
+  (eq_attr "type" "load,fpload,fpidxload")
+  "alu")
+
+(define_insn_reservation "generic_store" 1
+  (eq_attr "type" "store,fpstore,fpidxstore")
+  "alu")
+
+(define_insn_reservation "generic_xfer" 2
+  (eq_attr "type" "mftg,mgtf")
+  "alu")
+
+(define_insn_reservation "generic_branch" 1
+  (eq_attr "type" "branch,jump,call")
+  "alu")
+
+(define_insn_reservation "generic_imul" 17
+  (eq_attr "type" "imul")
+  "imuldiv*17")
+
+(define_insn_reservation "generic_fcvt" 1
+  (eq_attr "type" "fcvt")
+  "alu")
+
+(define_insn_reservation "generic_fmove" 2
+  (eq_attr "type" "fabs,fneg,fmove")
+  "alu")
+
+(define_insn_reservation "generic_fcmp" 3
+  (eq_attr "type" "fcmp")
+  "alu")
+
+(define_insn_reservation "generic_fadd" 4
+  (eq_attr "type" "fadd")
+  "alu")
+
+(define_insn_reservation "generic_fmul_single" 7
+  (and (eq_attr "type" "fmul,fmadd")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fmul_double" 8
+  (and (eq_attr "type" "fmul,fmadd")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_fdiv_single" 23
+  (and (eq_attr "type" "fdiv,frdiv")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fdiv_double" 36
+  (and (eq_attr "type" "fdiv,frdiv")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_fsqrt_single" 54
+  (and (eq_attr "type" "fsqrt,frsqrt")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fsqrt_double" 112
+  (and (eq_attr "type" "fsqrt,frsqrt")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_atomic" 10
+  (eq_attr "type" "atomic")
+  "alu")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) branch and ALU instruction.
+;; The net result of this reservation is a big delay with a flush of
+;; ALU pipeline.
+(define_insn_reservation "generic_sync_loop" 40
+  (eq_attr "type" "syncloop")
+  "alu*39")
diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
new file mode 100644
index 00000000000..ae3808b51bb
--- /dev/null
+++ b/gcc/config/loongarch/la464.md
@@ -0,0 +1,132 @@ 
+;; Pipeline model for LoongArch LA464 cores.
+
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "la464_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "la464_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "la464_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "la464_alu1" "la464_a_alu")
+(define_cpu_unit "la464_alu2" "la464_a_alu")
+(define_cpu_unit "la464_mem1" "la464_a_mem")
+(define_cpu_unit "la464_mem2" "la464_a_mem")
+(define_cpu_unit "la464_falu1" "la464_a_falu")
+(define_cpu_unit "la464_falu2" "la464_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "la464_arith" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_branch" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_imul" 7
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "imul"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_idiv_si" 12
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "idiv")
+	    (eq_attr "mode" "SI")))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_idiv_di" 25
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "idiv")
+	    (eq_attr "mode" "DI")))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_load" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "load"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_gpr_fp" 16
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "mftg,mgtf"))
+  "la464_mem1")
+
+(define_insn_reservation "la464_fpload" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fpload"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_prefetch" 0
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "prefetch,prefetchx"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_store" 0
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "store,fpstore,fpidxstore"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_fadd" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fadd,fmul,fmadd"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fcmp" 2
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fabs,fcmp,fmove,fneg"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fcvt" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fcvt"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fdiv_sf" 12
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
+	    (eq_attr "mode" "SF")))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fdiv_df" 19
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
+	    (eq_attr "mode" "DF")))
+  "la464_falu1 | la464_falu2")
+
+;; Force single-dispatch for unknown or multi.
+(define_insn_reservation "la464_unknown" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
+  "la464_alu1 + la464_alu2 + la464_falu1
+   + la464_falu2 + la464_mem1 + la464_mem2")
+
+;; End of DFA-based pipeline description for la464
diff --git a/gcc/config/loongarch/loongarch-ftypes.def b/gcc/config/loongarch/loongarch-ftypes.def
new file mode 100644
index 00000000000..2137a3d2c9f
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-ftypes.def
@@ -0,0 +1,106 @@ 
+/* Definitions of prototypes for LoongArch built-in functions.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Invoke DEF_LARCH_FTYPE (NARGS, LIST) for each prototype used by
+   LoongArch built-in functions, where:
+
+      NARGS is the number of arguments.
+      LIST contains the return-type code followed by the codes for each
+      argument type.
+
+   Argument- and return-type codes are either modes or one of the following:
+
+      VOID for void_type_node
+      INT for integer_type_node
+      POINTER for ptr_type_node
+
+   (we don't use PTR because that's an ANSI-compatibility macro).
+
+   Please keep this list lexicographically sorted by the LIST argument.  */
+
+DEF_LARCH_FTYPE (1, (DF, DF))
+DEF_LARCH_FTYPE (2, (DF, DF, DF))
+
+DEF_LARCH_FTYPE (1, (DI, DI))
+DEF_LARCH_FTYPE (2, (DI, DI, DI))
+DEF_LARCH_FTYPE (3, (DI, DI, DI, QI))
+DEF_LARCH_FTYPE (2, (DI, DI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, SI, SI))
+DEF_LARCH_FTYPE (2, (DI, DI, UQI))
+DEF_LARCH_FTYPE (3, (DI, DI, USI, USI))
+DEF_LARCH_FTYPE (2, (DI, POINTER, SI))
+DEF_LARCH_FTYPE (1, (DI, SI))
+DEF_LARCH_FTYPE (2, (DI, SI, SI))
+DEF_LARCH_FTYPE (1, (DI, UQI))
+DEF_LARCH_FTYPE (2, (DI, USI, USI))
+
+DEF_LARCH_FTYPE (2, (HI, HI, HI))
+
+DEF_LARCH_FTYPE (2, (INT, DF, DF))
+DEF_LARCH_FTYPE (2, (INT, SF, SF))
+
+DEF_LARCH_FTYPE (2, (QI, QI, QI))
+
+DEF_LARCH_FTYPE (1, (SF, SF))
+DEF_LARCH_FTYPE (2, (SF, SF, SF))
+
+DEF_LARCH_FTYPE (2, (SI, DI, SI))
+DEF_LARCH_FTYPE (2, (SI, HI, SI))
+DEF_LARCH_FTYPE (2, (SI, POINTER, SI))
+DEF_LARCH_FTYPE (2, (SI, QI, SI))
+DEF_LARCH_FTYPE (1, (SI, SI))
+DEF_LARCH_FTYPE (2, (SI, SI, SI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, QI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, SI))
+DEF_LARCH_FTYPE (2, (SI, SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, UDI))
+DEF_LARCH_FTYPE (1, (SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, VOID))
+
+DEF_LARCH_FTYPE (2, (UDI, UDI, UDI))
+DEF_LARCH_FTYPE (3, (UDI, UDI, UDI, USI))
+DEF_LARCH_FTYPE (2, (UDI, UDI, USI))
+DEF_LARCH_FTYPE (1, (UDI, USI))
+
+DEF_LARCH_FTYPE (1, (UHI, USI))
+
+DEF_LARCH_FTYPE (1, (UQI, USI))
+
+DEF_LARCH_FTYPE (1, (USI, UQI))
+DEF_LARCH_FTYPE (1, (USI, USI))
+DEF_LARCH_FTYPE (2, (USI, USI, USI))
+DEF_LARCH_FTYPE (3, (USI, USI, USI, USI))
+DEF_LARCH_FTYPE (1, (USI, VOID))
+
+DEF_LARCH_FTYPE (2, (VOID, DI, DI))
+DEF_LARCH_FTYPE (2, (VOID, DI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (VOID, SI, SI))
+DEF_LARCH_FTYPE (2, (VOID, SI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, UDI, USI))
+DEF_LARCH_FTYPE (2, (VOID, UHI, USI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, SI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, USI))
+DEF_LARCH_FTYPE (1, (VOID, USI))
+DEF_LARCH_FTYPE (3, (VOID, USI, UDI, SI))
+DEF_LARCH_FTYPE (2, (VOID, USI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, USI, USI))
+DEF_LARCH_FTYPE (3, (VOID, USI, USI, SI))
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
new file mode 100644
index 00000000000..c0261b2cc83
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -0,0 +1,29 @@ 
+/* LoongArch extra machine modes.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+FLOAT_MODE (TF, 16, ieee_quad_format);
+
+VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
+
+/* For floating point conditions in FCC registers.  */
+CC_MODE (FCC);
+
+INT_MODE (OI, 32);
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
new file mode 100644
index 00000000000..a9a8bc4b038
--- /dev/null
+++ b/gcc/config/loongarch/loongarch.md
@@ -0,0 +1,3712 @@ 
+;; Machine Description for LoongArch for GNU compiler.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  ;; Integer operations that are too cumbersome to describe directly.
+  UNSPEC_REVB_2H
+  UNSPEC_REVB_4H
+  UNSPEC_REVH_D
+
+  ;; Floating-point moves.
+  UNSPEC_LOAD_LOW
+  UNSPEC_LOAD_HIGH
+  UNSPEC_STORE_WORD
+  UNSPEC_MOVGR2FRH
+  UNSPEC_MOVFRH2GR
+
+  ;; Floating point unspecs.
+  UNSPEC_FRINT
+  UNSPEC_FCLASS
+
+  ;; Override return address for exception handling.
+  UNSPEC_EH_RETURN
+
+  ;; Bit operation
+  UNSPEC_BYTEPICK_W
+  UNSPEC_BYTEPICK_D
+  UNSPEC_BITREV_4B
+  UNSPEC_BITREV_8B
+
+  ;; TLS
+  UNSPEC_TLS_GD
+  UNSPEC_TLS_LD
+  UNSPEC_TLS_LE
+  UNSPEC_TLS_IE
+
+  ;; Stack tie
+  UNSPEC_TIE
+
+  ;; CRC
+  UNSPEC_CRC
+  UNSPEC_CRCC
+])
+
+(define_c_enum "unspecv" [
+  ;; Blockage and synchronisation.
+  UNSPECV_BLOCKAGE
+  UNSPECV_DBAR
+  UNSPECV_IBAR
+
+  ;; CPUCFG
+  UNSPECV_CPUCFG
+  UNSPECV_ASRTLE_D
+  UNSPECV_ASRTGT_D
+
+  ;; Privileged instructions
+  UNSPECV_CSRRD
+  UNSPECV_CSRWR
+  UNSPECV_CSRXCHG
+  UNSPECV_IOCSRRD
+  UNSPECV_IOCSRWR
+  UNSPECV_CACOP
+  UNSPECV_LDDIR
+  UNSPECV_LDPTE
+  UNSPECV_ERTN
+
+  ;; Stack checking.
+  UNSPECV_PROBE_STACK_RANGE
+
+  ;; Floating-point environment.
+  UNSPECV_MOVFCSR2GR
+  UNSPECV_MOVGR2FCSR
+
+])
+
+(define_constants
+  [(RETURN_ADDR_REGNUM		1)
+   (T0_REGNUM			12)
+   (T1_REGNUM			13)
+   (S0_REGNUM			23)
+
+   ;; PIC long branch sequences are never longer than 100 bytes.
+   (MAX_PIC_BRANCH_LENGTH	100)
+])
+
+(include "predicates.md")
+(include "constraints.md")
+
+;; ....................
+;;
+;;	Attributes
+;;
+;; ....................
+
+(define_attr "got" "unset,load"
+  (const_string "unset"))
+
+;; For jirl instructions, this attribute is DIRECT when the target address
+;; is symbolic and INDIRECT when it is a register.
+(define_attr "jirl" "unset,direct,indirect"
+  (const_string "unset"))
+
+
+;; Classification of moves, extensions and truncations.  Most values
+;; are as for "type" (see below) but there are also the following
+;; move-specific values:
+;;
+;; sll0		"slli.w DEST,SRC,0", which on 64-bit targets is guaranteed
+;;		to produce a sign-extended DEST, even if SRC is not
+;;		properly sign-extended
+;; pick_ins	BSTRPICK.W, BSTRPICK.D, BSTRINS.W or BSTRINS.D instruction
+;; andi		a single ANDI instruction
+;; shift_shift	a shift left followed by a shift right
+;;
+;; This attribute is used to determine the instruction's length and
+;; scheduling type.  For doubleword moves, the attribute always describes
+;; the split instructions; in some cases, it is more appropriate for the
+;; scheduling type to be "multi" instead.
+(define_attr "move_type"
+  "unknown,load,fpload,store,fpstore,mgtf,mftg,imul,move,fmove,
+   const,signext,pick_ins,logical,arith,sll0,andi,shift_shift"
+  (const_string "unknown"))
+
+(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor"
+  (const_string "unknown"))
+
+;; Main data type used by the insn
+(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
+  (const_string "unknown"))
+
+;; True if the main data type is twice the size of a word.
+(define_attr "dword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "DI,DF")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")
+
+	 (and (eq_attr "mode" "TI,TF")
+	      (match_test "TARGET_64BIT"))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; True if the main data type is four times the size of a word.
+(define_attr "qword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "TI,TF")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; True if the main data type is eight times the size of a word.
+(define_attr "oword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "OI")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; Classification of each insn.
+;; branch	conditional branch
+;; jump		unconditional jump
+;; call		unconditional call
+;; load		load instruction(s)
+;; fpload	floating point load
+;; fpidxload    floating point indexed load
+;; store	store instruction(s)
+;; fpstore	floating point store
+;; fpidxstore	floating point indexed store
+;; prefetch	memory prefetch (register + offset)
+;; prefetchx	memory indexed prefetch (register + register)
+;; condmove	conditional moves
+;; mgtf		move general-purpose register to floating point register
+;; mftg		move floating point register to general-purpose register
+;; const	load constant
+;; arith	integer arithmetic instructions
+;; logical      integer logical instructions
+;; shift	integer shift instructions
+;; slt		set less than instructions
+;; signext      sign extend instructions
+;; clz		the clz and clo instructions
+;; trap		trap if instructions
+;; imul		integer multiply
+;; idiv		integer divide
+;; move		integer move
+;; fmove	floating point register move
+;; fadd		floating point add/subtract
+;; fmul		floating point multiply
+;; fmadd	floating point multiply-add
+;; fdiv		floating point divide
+;; frdiv	floating point reciprocal divide
+;; fabs		floating point absolute value
+;; fneg		floating point negation
+;; fcmp		floating point compare
+;; fcvt		floating point convert
+;; fsqrt	floating point square root
+;; frsqrt       floating point reciprocal square root
+;; multi	multiword sequence (or user asm statements)
+;; atomic	atomic memory update instruction
+;; syncloop	memory atomic operation implemented as a sync loop
+;; nop		no operation
+;; ghost	an instruction that produces no real code
+(define_attr "type"
+  "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
+   prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
+   shift,slt,signext,clz,trap,imul,idiv,move,
+   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcvt,fsqrt,
+   frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
+  (cond [(eq_attr "jirl" "!unset") (const_string "call")
+	 (eq_attr "got" "load") (const_string "load")
+
+	 (eq_attr "alu_type" "add,sub") (const_string "arith")
+
+	 (eq_attr "alu_type" "not,nor,and,or,xor") (const_string "logical")
+
+	 ;; If a doubleword move uses these expensive instructions,
+	 ;; it is usually better to schedule them in the same way
+	 ;; as the singleword form, rather than as "multi".
+	 (eq_attr "move_type" "load") (const_string "load")
+	 (eq_attr "move_type" "fpload") (const_string "fpload")
+	 (eq_attr "move_type" "store") (const_string "store")
+	 (eq_attr "move_type" "fpstore") (const_string "fpstore")
+	 (eq_attr "move_type" "mgtf") (const_string "mgtf")
+	 (eq_attr "move_type" "mftg") (const_string "mftg")
+
+	 ;; These types of move are always single insns.
+	 (eq_attr "move_type" "imul") (const_string "imul")
+	 (eq_attr "move_type" "fmove") (const_string "fmove")
+	 (eq_attr "move_type" "signext") (const_string "signext")
+	 (eq_attr "move_type" "pick_ins") (const_string "arith")
+	 (eq_attr "move_type" "arith") (const_string "arith")
+	 (eq_attr "move_type" "logical") (const_string "logical")
+	 (eq_attr "move_type" "sll0") (const_string "shift")
+	 (eq_attr "move_type" "andi") (const_string "logical")
+
+	 ;; These types of move are always split.
+	 (eq_attr "move_type" "shift_shift")
+	   (const_string "multi")
+
+	 ;; These types of move are split for octaword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "oword_mode" "yes"))
+	   (const_string "multi")
+
+	 ;; These types of move are split for quadword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "qword_mode" "yes"))
+	   (const_string "multi")
+
+	 ;; These types of move are split for doubleword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "dword_mode" "yes"))
+	   (const_string "multi")
+	 (eq_attr "move_type" "move") (const_string "move")
+	 (eq_attr "move_type" "const") (const_string "const")]
+	(const_string "unknown")))
+
+;; Mode for conversion types (fcvt)
+;; I2S	integer to float single (SI/DI to SF)
+;; I2D	integer to float double (SI/DI to DF)
+;; S2I	float to integer (SF to SI/DI)
+;; D2I	float to integer (DF to SI/DI)
+;; D2S	double to float single
+;; S2D	float single to double
+
+(define_attr "cnv_mode" "unknown,I2S,I2D,S2I,D2I,D2S,S2D"
+  (const_string "unknown"))
+
+(define_attr "compression" "none,all"
+  (const_string "none"))
+
+;; The number of individual instructions that a non-branch pattern generates
+(define_attr "insn_count" ""
+  (cond [;; "Ghost" instructions occupy no space.
+	 (eq_attr "type" "ghost")
+	 (const_int 0)
+
+	 ;; Check for doubleword moves that are decomposed into two
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "dword_mode" "yes"))
+	 (const_int 2)
+
+	 ;; Check for quadword moves that are decomposed into four
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "qword_mode" "yes"))
+	 (const_int 4)
+
+	 ;; Check for octaword moves that are decomposed into eight
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "oword_mode" "yes"))
+	 (const_int 8)
+
+	 ;; Constants, loads and stores are handled by external routines.
+	 (and (eq_attr "move_type" "const")
+	      (eq_attr "dword_mode" "yes"))
+	 (symbol_ref "loongarch_split_const_insns (operands[1])")
+	 (eq_attr "move_type" "const")
+	 (symbol_ref "loongarch_const_insns (operands[1])")
+	 (eq_attr "move_type" "load,fpload")
+	 (symbol_ref "loongarch_load_store_insns (operands[1], insn)")
+	 (eq_attr "move_type" "store,fpstore")
+	 (symbol_ref "loongarch_load_store_insns (operands[0], insn)")
+
+	 (eq_attr "type" "idiv")
+	 (symbol_ref "loongarch_idiv_insns (GET_MODE (PATTERN (insn)))")]
+	(const_int 1)))
+
+;; Length of instruction in bytes.
+(define_attr "length" ""
+   (cond [
+	  ;; Branches further than +/- 128 KiB require two instructions.
+	  (eq_attr "type" "branch")
+	  (if_then_else (and (le (minus (match_dup 0) (pc)) (const_int 131064))
+			     (le (minus (pc) (match_dup 0)) (const_int 131068)))
+	  (const_int 4)
+	  (const_int 8))
+	  ](symbol_ref "get_attr_insn_count (insn) * 4")))
+
+;; Describe a user's asm statement.
+(define_asm_attributes
+  [(set_attr "type" "multi")])
+
+;; This mode iterator allows 32-bit and 64-bit GPR patterns to be generated
+;; from the same template.
+(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
+
+;; A copy of GPR that can be used when a pattern has two independent
+;; modes.
+(define_mode_iterator GPR2 [SI (DI "TARGET_64BIT")])
+
+;; Likewise, but for GRLEN-sized (native word size) quantities.
+(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
+
+;; This mode iterator allows 16-bit and 32-bit GPR patterns and 32-bit and
+;; 64-bit FPR patterns to be generated from the same template.
+(define_mode_iterator JOIN_MODE [HI
+				 SI
+				 (SF "TARGET_HARD_FLOAT")
+				 (DF "TARGET_DOUBLE_FLOAT")])
+
+;; This mode iterator allows :P to be used for patterns that operate on
+;; pointer-sized quantities.  Exactly one of the two alternatives will match.
+(define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
+
+;; 64-bit modes for which we provide move patterns.
+(define_mode_iterator MOVE64 [DI DF])
+
+;; 128-bit modes for which we provide move patterns on 64-bit targets.
+(define_mode_iterator MOVE128 [TI TF])
+
+;; Iterator for sub-32-bit integer modes.
+(define_mode_iterator SHORT [QI HI])
+
+;; Sub-64-bit integer modes, for the 64-bit truncate-and-shift patterns.
+(define_mode_iterator SUBDI [QI HI SI])
+
+;; Allow the QI, HI, SI and DI extension patterns to share one template.
+(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
+
+;; Iterator for hardware-supported floating-point modes.
+(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
+			    (DF "TARGET_DOUBLE_FLOAT")])
+
+;; A floating-point mode for which moves involving FPRs may need to be split.
+(define_mode_iterator SPLITF
+  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
+   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
+   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
+
+;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
+;; 32-bit version and "mul.d" in the 64-bit version.
+(define_mode_attr d [(SI "w") (DI "d")])
+
+;; This attribute gives the length suffix for a load or store instruction.
+;; The same suffixes work for zero and sign extensions.
+(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
+(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
+
+;; This attribute gives the mode mask of a SHORT.
+(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
+
+;; This attribute gives the index of the most significant bit of a SHORT.
+(define_mode_attr qi_hi [(QI "7") (HI "15")])
+
+;; Instruction names for stores.
+(define_mode_attr store [(QI "sb") (HI "sh") (SI "sw") (DI "sd")])
+
+;; Similarly for LoongArch indexed FPR loads and stores.
+(define_mode_attr floadx [(SF "fldx.s") (DF "fldx.d") (V2SF "fldx.d")])
+(define_mode_attr fstorex [(SF "fstx.s") (DF "fstx.d") (V2SF "fstx.d")])
+
+;; Similarly for LoongArch indexed GPR loads and stores.
+(define_mode_attr loadx [(QI "ldx.b")
+			 (HI "ldx.h")
+			 (SI "ldx.w")
+			 (DI "ldx.d")])
+(define_mode_attr storex [(QI "stx.b")
+			  (HI "stx.h")
+			  (SI "stx.w")
+			  (DI "stx.d")])
+
+;; This attribute gives the format suffix for floating-point operations.
+(define_mode_attr fmt [(SF "s") (DF "d")])
+
+;; This attribute gives the upper-case mode name for one unit of a
+;; floating-point mode or vector mode.
+(define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF")])
+
+;; This attribute gives the integer mode that has half the size of
+;; the controlling mode.
+(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (V2SF "SI") (TF "DI")])
+
+;; This attribute gives the integer prefix for some instruction templates.
+(define_mode_attr p [(SI "") (DI "d")])
+
+;; This code iterator allows signed and unsigned widening multiplications
+;; to use the same template.
+(define_code_iterator any_extend [sign_extend zero_extend])
+
+;; This code iterator allows the two right shift instructions to be
+;; generated from the same template.
+(define_code_iterator any_shiftrt [ashiftrt lshiftrt])
+
+;; This code iterator allows the three shift instructions to be generated
+;; from the same template.
+(define_code_iterator any_shift [ashift ashiftrt lshiftrt])
+
+;; This code iterator allows the three bitwise instructions to be generated
+;; from the same template.
+(define_code_iterator any_bitwise [and ior xor])
+
+;; This code iterator allows signed and unsigned division and remainder to
+;; be generated from the same template.
+(define_code_iterator any_div [div udiv mod umod])
+
+;; This code iterator allows all native floating-point comparisons to be
+;; generated from the same template.
+(define_code_iterator fcond [unordered uneq unlt unle eq lt le
+			     ordered ltgt ne ge gt unge ungt])
+
+;; Equality operators.
+(define_code_iterator equality_op [eq ne])
+
+;; These code iterators allow the signed and unsigned scc operations to use
+;; the same template.
+(define_code_iterator any_gt [gt gtu])
+(define_code_iterator any_ge [ge geu])
+(define_code_iterator any_lt [lt ltu])
+(define_code_iterator any_le [le leu])
+
+(define_code_iterator any_return [return simple_return])
+
+;; <u> expands to an empty string when doing a signed operation and
+;; "u" when doing an unsigned operation.
+(define_code_attr u [(sign_extend "") (zero_extend "u")
+		     (div "") (udiv "u")
+		     (mod "") (umod "u")
+		     (gt "") (gtu "u")
+		     (ge "") (geu "u")
+		     (lt "") (ltu "u")
+		     (le "") (leu "u")])
+
+;; <U> is like <u> except uppercase.
+(define_code_attr U [(sign_extend "") (zero_extend "U")])
+
+;; <su> is like <u>, but the signed form expands to "s" rather than "".
+(define_code_attr su [(sign_extend "s") (zero_extend "u")])
+
+;; <optab> expands to the name of the optab for a particular code.
+(define_code_attr optab [(ashift "ashl")
+			 (ashiftrt "ashr")
+			 (lshiftrt "lshr")
+			 (ior "ior")
+			 (xor "xor")
+			 (and "and")
+			 (plus "add")
+			 (minus "sub")
+			 (mult "mul")
+			 (div "div")
+			 (udiv "udiv")
+			 (mod "mod")
+			 (umod "umod")
+			 (return "return")
+			 (simple_return "simple_return")])
+
+;; <insn> expands to the name of the insn that implements a particular code.
+(define_code_attr insn [(ashift "sll")
+			(ashiftrt "sra")
+			(lshiftrt "srl")
+			(ior "or")
+			(xor "xor")
+			(and "and")
+			(plus "addu")
+			(minus "subu")
+			(div "div")
+			(udiv "div")
+			(mod "mod")
+			(umod "mod")])
+
+;; <fcond> is the fcmp.cond.fmt condition associated with a particular code.
+(define_code_attr fcond [(unordered "cun")
+			 (uneq "cueq")
+			 (unlt "cult")
+			 (unle "cule")
+			 (eq "ceq")
+			 (lt "slt")
+			 (le "sle")
+			 (ordered "cor")
+			 (ltgt "sne")
+			 (ne "cune")
+			 (ge "sge")
+			 (gt "sgt")
+			 (unge "cuge")
+			 (ungt "cugt")])
+
+;; The sel mnemonic to use depending on the condition test.
+(define_code_attr sel [(eq "masknez") (ne "maskeqz")])
+(define_code_attr selinv [(eq "maskeqz") (ne "masknez")])
+
+;;
+;;  ....................
+;;
+;;	CONDITIONAL TRAPS
+;;
+;;  ....................
+;;
+
+(define_insn "trap"
+  [(trap_if (const_int 1) (const_int 0))]
+  ""
+{
+  return "break\t0";
+}
+  [(set_attr "type" "trap")])
+
+
+
+;;
+;;  ....................
+;;
+;;	ADDITION
+;;
+;;  ....................
+;;
+
+(define_insn "add<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(plus:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		   (match_operand:ANYF 2 "register_operand" "f")))]
+  ""
+  "fadd.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "add<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(plus:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		  (match_operand:GPR 2 "arith_operand" "r,I")))]
+  ""
+  "add%i2.<d>\t%0,%1,%2"
+  [(set_attr "alu_type" "add")
+   (set_attr "compression" "*,*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*addsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+	(sign_extend:DI
+	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
+		      (match_operand:SI 2 "arith_operand" "r,I"))))]
+  "TARGET_64BIT"
+  "add%i2.w\t%0,%1,%2"
+  [(set_attr "alu_type" "add")
+   (set_attr "mode" "SI")])
+
+(define_insn "*addsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+	(sign_extend:DI
+	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
+			      (match_operand:DI 2 "arith_operand"    "r,I"))
+		     0)))]
+  "TARGET_64BIT"
+  "add%i2.w\t%0,%1,%2"
+  [(set_attr "alu_type" "add")
+   (set_attr "mode" "SI")])
+
+
+;;
+;;  ....................
+;;
+;;	SUBTRACTION
+;;
+;;  ....................
+;;
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(minus:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		    (match_operand:ANYF 2 "register_operand" "f")))]
+  ""
+  "fsub.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(minus:GPR (match_operand:GPR 1 "register_operand" "rJ")
+		   (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "sub.<d>\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "compression" "*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*subsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (minus:SI (match_operand:SI 1 "register_operand" "rJ")
+		      (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "sub.w\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "mode" "DI")])
+
+(define_insn "*subsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (subreg:SI (minus:DI (match_operand:DI 1 "reg_or_0_operand" "rJ")
+			       (match_operand:DI 2 "register_operand" "r"))
+		     0)))]
+  "TARGET_64BIT"
+  "sub.w\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "mode" "SI")])
+
+
+;;
+;;  ....................
+;;
+;;	MULTIPLICATION
+;;
+;;  ....................
+;;
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		   (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmul.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(mult:GPR (match_operand:GPR 1 "register_operand" "r")
+		  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "mul.<d>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mulsidi3_64bit"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
+		 (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "mul.d\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "DI")])
+
+(define_insn "*mulsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (mult:SI (match_operand:SI 1 "register_operand" "r")
+		     (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "mul.w\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+(define_insn "*mulsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (subreg:SI (mult:DI (match_operand:DI 1 "register_operand" "r")
+			      (match_operand:DI 2 "register_operand" "r"))
+		     0)))]
+  "TARGET_64BIT"
+  "mul.w\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+
+;;
+;;  ........................
+;;
+;;	MULTIPLICATION HIGH-PART
+;;
+;;  ........................
+;;
+
+
+(define_expand "<u>mulditi3"
+  [(set (match_operand:TI 0 "register_operand")
+	(mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
+		 (any_extend:TI (match_operand:DI 2 "register_operand"))))]
+  "TARGET_64BIT"
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_<u>muldi3_highpart (high, operands[1], operands[2]));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
+
+(define_insn "<u>muldi3_highpart"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(truncate:DI
+	  (lshiftrt:TI
+	    (mult:TI (any_extend:TI
+		       (match_operand:DI 1 "register_operand" " r"))
+		     (any_extend:TI
+		       (match_operand:DI 2 "register_operand" " r")))
+	    (const_int 64))))]
+  "TARGET_64BIT"
+  "mulh.d<u>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "DI")])
+
+(define_expand "<u>mulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(mult:DI (any_extend:DI
+		   (match_operand:SI 1 "register_operand" " r"))
+		 (any_extend:DI
+		   (match_operand:SI 2 "register_operand" " r"))))]
+  "!TARGET_64BIT"
+{
+  rtx temp = gen_reg_rtx (SImode);
+  emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+  emit_insn (gen_<u>mulsi3_highpart (loongarch_subword (operands[0], true),
+				     operands[1], operands[2]));
+  emit_insn (gen_movsi (loongarch_subword (operands[0], false), temp));
+  DONE;
+})
+
+(define_insn "<u>mulsi3_highpart"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(truncate:SI
+	  (lshiftrt:DI
+	    (mult:DI (any_extend:DI
+		       (match_operand:SI 1 "register_operand" " r"))
+		     (any_extend:DI
+		       (match_operand:SI 2 "register_operand" " r")))
+	    (const_int 32))))]
+  "!TARGET_64BIT"
+  "mulh.w<u>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+;;
+;;  ....................
+;;
+;;	DIVISION and REMAINDER
+;;
+;;  ....................
+;;
+
+;; Floating-point division and reciprocal.
+(define_expand "div<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(div:ANYF (match_operand:ANYF 1 "reg_or_1_operand")
+		  (match_operand:ANYF 2 "register_operand")))]
+  "TARGET_HARD_FLOAT"
+{})
+
+(define_insn "*div<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fdiv.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fdiv")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+;; In 3A5000, the reciprocal operation is the same as the division operation.
+
+(define_insn "*recip<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "frecip.<fmt>\t%0,%2"
+  [(set_attr "type" "frdiv")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+;; Integer division and modulus.
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(any_div:GPR (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  {
+    return loongarch_output_division ("<insn>.<d><u>\t%0,%1,%2", operands);
+  }
+  [(set_attr "type" "idiv")
+   (set_attr "mode" "<MODE>")])
+
+
+;; Floating point multiply accumulate instructions.
+
+;; a * b + c
+(define_expand "fma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand")
+		  (match_operand:ANYF 2 "register_operand")
+		  (match_operand:ANYF 3 "register_operand")))]
+  "TARGET_HARD_FLOAT")
+
+(define_insn "*fma<mode>4_madd4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (match_operand:ANYF 3 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; a * b - c
+(define_insn "fms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (neg:ANYF (match_operand:ANYF 3 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT"
+  "fmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; fnma is defined in GCC as (fma (neg op1) op2 op3)
+;; (-op1 * op2) + op3 ==> -(op1 * op2) + op3 ==> -((op1 * op2) - op3)
+;; The loongarch nmsub instructions implement -((op1 * op2) - op3)
+;; This transformation means we may return the wrong signed zero
+;; so we check HONOR_SIGNED_ZEROS.
+
+;; -a * b + c
+(define_insn "fnma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (match_operand:ANYF 3 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fnmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; fnms is defined as: (fma (neg op1) op2 (neg op3))
+;; ((-op1) * op2) - op3 ==> -(op1 * op2) - op3 ==> -((op1 * op2) + op3)
+;; The loongarch nmadd instructions implement -((op1 * op2) + op3)
+;; This transformation means we may return the wrong signed zero
+;; so we check HONOR_SIGNED_ZEROS.
+
+;; -a * b - c
+(define_insn "fnms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF
+	  (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+	  (match_operand:ANYF 2 "register_operand" "f")
+	  (neg:ANYF (match_operand:ANYF 3 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fnmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(-a * b - c), modulo signed zeros
+(define_insn "*fma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(neg:ANYF (match_operand:ANYF 1 "register_operand" " f"))
+		(match_operand:ANYF 2 "register_operand" " f")
+		(neg:ANYF (match_operand:ANYF 3 "register_operand" " f")))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(-a * b + c), modulo signed zeros
+(define_insn "*fms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(neg:ANYF (match_operand:ANYF 1 "register_operand" " f"))
+		(match_operand:ANYF 2 "register_operand" " f")
+		(match_operand:ANYF 3 "register_operand" " f"))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(a * b + c)
+(define_insn "*fnms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(match_operand:ANYF 1 "register_operand" " f")
+		(match_operand:ANYF 2 "register_operand" " f")
+		(match_operand:ANYF 3 "register_operand" " f"))))]
+  "TARGET_HARD_FLOAT"
+  "fnmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(a * b - c)
+(define_insn "*fnma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(match_operand:ANYF 1 "register_operand" " f")
+		(match_operand:ANYF 2 "register_operand" " f")
+		(neg:ANYF (match_operand:ANYF 3 "register_operand" " f")))))]
+  "TARGET_HARD_FLOAT"
+  "fnmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;;
+;;  ....................
+;;
+;;	SQUARE ROOT
+;;
+;;  ....................
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(sqrt:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fsqrt.<fmt>\t%0,%1"
+  [(set_attr "type" "fsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+(define_insn "*rsqrt<mode>a"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+		  (sqrt:ANYF (match_operand:ANYF 2 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && flag_unsafe_math_optimizations"
+  "frsqrt.<fmt>\t%0,%2"
+  [(set_attr "type" "frsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+(define_insn "*rsqrt<mode>b"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(sqrt:ANYF (div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+			     (match_operand:ANYF 2 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && flag_unsafe_math_optimizations"
+  "frsqrt.<fmt>\t%0,%2"
+  [(set_attr "type" "frsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+;;
+;;  ....................
+;;
+;;	ABSOLUTE VALUE
+;;
+;;  ....................
+
+(define_insn "abs<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(abs:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fabs.<fmt>\t%0,%1"
+  [(set_attr "type" "fabs")
+   (set_attr "mode" "<UNITMODE>")])
+
+;;
+;;  ...................
+;;
+;;  Count leading zeroes.
+;;
+;;  ...................
+;;
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(clz:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "clz.<d>\t%0,%1"
+  [(set_attr "type" "clz")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ...................
+;;
+;;  Count trailing zeroes.
+;;
+;;  ...................
+;;
+
+(define_insn "ctz<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(ctz:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "ctz.<d>\t%0,%1"
+  [(set_attr "type" "clz")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ....................
+;;
+;;	MIN/MAX
+;;
+;;  ....................
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (smax:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmax.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (smin:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmin.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smaxa<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (if_then_else:ANYF
+	      (gt (abs:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		  (abs:ANYF (match_operand:ANYF 2 "register_operand" "f")))
+	      (match_dup 1)
+	      (match_dup 2)))]
+  "TARGET_HARD_FLOAT"
+  "fmaxa.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smina<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (if_then_else:ANYF
+		(lt (abs:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		    (abs:ANYF (match_operand:ANYF 2 "register_operand" "f")))
+		(match_dup 1)
+		(match_dup 2)))]
+  "TARGET_HARD_FLOAT"
+  "fmina.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ....................
+;;
+;;	NEGATION and ONE'S COMPLEMENT
+;;
+;;  ....................
+
+(define_insn "neg<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(neg:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "sub.<d>\t%0,%.,%1"
+  [(set_attr "alu_type"	"sub")
+   (set_attr "mode"	"<MODE>")])
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(not:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "nor\t%0,%.,%1"
+  [(set_attr "alu_type" "not")
+   (set_attr "compression" "*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "neg<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fneg.<fmt>\t%0,%1"
+  [(set_attr "type" "fneg")
+   (set_attr "mode" "<UNITMODE>")])
+
+
+;;
+;;  ....................
+;;
+;;	LOGICAL
+;;
+;;  ....................
+;;
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(any_bitwise:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		 (match_operand:GPR 2 "uns_arith_operand" "r,K")))]
+  ""
+  "<insn>%i2\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "compression" "*,*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3_extended"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR (match_operand:GPR 1 "nonimmediate_operand" "r")
+		 (match_operand:GPR 2 "low_bitmask_operand" "Yx")))]
+  ""
+{
+  int len;
+
+  len = low_bitmask_len (<MODE>mode, INTVAL (operands[2]));
+  operands[2] = GEN_INT (len-1);
+  return "bstrpick.<d>\t%0,%1,%2,0";
+}
+  [(set_attr "move_type" "pick_ins")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*iorhi3"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+	(ior:HI (match_operand:HI 1 "register_operand" "r,r")
+		(match_operand:HI 2 "uns_arith_operand" "r,K")))]
+  ""
+  "or%i2\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "mode" "HI")])
+
+(define_insn "*nor<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+		 (not:GPR (match_operand:GPR 2 "register_operand" "r"))))]
+  ""
+  "nor\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "andn<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR
+	  (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+	  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "andn\t%0,%2,%1"
+  [(set_attr "type" "logical")])
+
+(define_insn "orn<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(ior:GPR
+	  (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+	  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "orn\t%0,%2,%1"
+  [(set_attr "type" "logical")])
+
+
+;;
+;;  ....................
+;;
+;;	TRUNCATION
+;;
+;;  ....................
+
+(define_insn "truncdfsf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float_truncate:SF (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "fcvt.s.d\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "cnv_mode"	"D2S")
+   (set_attr "mode" "SF")])
+
+;; Integer truncation patterns.  Truncating SImode values to smaller
+;; modes is a no-op, as it is for most other GCC ports.  Truncating
+;; DImode values to SImode is not a no-op for TARGET_64BIT since we
+;; need to make sure that the lower 32 bits are properly sign-extended
+;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
+;; smaller than SImode is equivalent to two separate truncations:
+;;
+;;			  A       B
+;;    DI ---> HI  ==  DI ---> SI ---> HI
+;;    DI ---> QI  ==  DI ---> SI ---> QI
+;;
+;; Step A needs a real instruction but step B does not.
+
+(define_insn "truncdi<mode>2"
+  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
+	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
+  "TARGET_64BIT"
+  "@
+    slli.w\t%0,%1,0
+    st.<size>\t%1,%0
+    stx.<size>\t%1,%0"
+  [(set_attr "move_type" "sll0,store,store")
+   (set_attr "mode" "SI")])
+
+(define_insn "truncdisi2_extended"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=ZC")
+	(truncate:SI (match_operand:DI 1 "register_operand" "r")))]
+  "TARGET_64BIT"
+  "stptr.w\t%1,%0"
+  [(set_attr "move_type" "store")
+   (set_attr "mode" "SI")])
+
+;; Combiner patterns to optimize shift/truncate combinations.
+
+(define_insn "*ashr_trunc<mode>"
+  [(set (match_operand:SUBDI 0 "register_operand" "=r")
+	(truncate:SUBDI
+	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "const_arith_operand" ""))))]
+  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
+  "srai.d\t%0,%1,%2"
+  [(set_attr "type" "shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*lshr32_trunc<mode>"
+  [(set (match_operand:SUBDI 0 "register_operand" "=r")
+	(truncate:SUBDI
+	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
+		       (const_int 32))))]
+  "TARGET_64BIT"
+  "srai.d\t%0,%1,32"
+  [(set_attr "type" "shift")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ....................
+;;
+;;	ZERO EXTENSION
+;;
+;;  ....................
+
+(define_insn "zero_extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
+	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
+  "TARGET_64BIT"
+  "@
+   bstrpick.d\t%0,%1,31,0
+   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0
+   ld.wu\t%0,%1
+   ldx.wu\t%0,%1"
+  [(set_attr "move_type" "arith,load,load,load")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "1,2,1,1")])
+
+(define_insn "zero_extend<SHORT:mode><GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r")
+	(zero_extend:GPR
+	     (match_operand:SHORT 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   bstrpick.w\t%0,%1,<SHORT:qi_hi>,0
+   ld.<SHORT:size>u\t%0,%1
+   ldx.<SHORT:size>u\t%0,%1"
+  [(set_attr "move_type" "pick_ins,load,load")
+   (set_attr "compression" "*,*,*")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "zero_extendqihi2"
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+	(zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   andi\t%0,%1,0xff
+   ld.bu\t%0,%1
+   ldx.bu\t%0,%1"
+  [(set_attr "move_type" "andi,load,load")
+   (set_attr "mode" "HI")])
+
+;; Combiner patterns to optimize truncate/zero_extend combinations.
+
+(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(zero_extend:GPR
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
+  [(set_attr "move_type" "pick_ins")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*zero_extendhi_truncqi"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(zero_extend:HI
+	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "andi\t%0,%1,0xff"
+  [(set_attr "alu_type" "and")
+   (set_attr "mode" "HI")])
+
+;;
+;;  ....................
+;;
+;;	SIGN EXTENSION
+;;
+;;  ....................
+
+;; Extension insns.
+;; Those for integer source operand are ordered widest source type first.
+
+;; When TARGET_64BIT, all SImode integers should already be in sign-extended
+;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
+;; get rid of register->register instructions if we constrain the source to
+;; be in the same register as the destination.
+;;
+;; Only the pre-reload scheduler sees the type of the register alternatives;
+;; we split them into nothing before the post-reload scheduler runs.
+;; These alternatives therefore have type "move" in order to reflect
+;; what happens if the two pre-reload operands cannot be tied, and are
+;; instead allocated two separate GPRs.
+(define_insn_and_split "extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
+	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
+  "TARGET_64BIT"
+  "@
+   #
+   ldptr.w\t%0,%1
+   ld.w\t%0,%1
+   ldx.w\t%0,%1"
+  "&& reload_completed && register_operand (operands[1], VOIDmode)"
+  [(const_int 0)]
+{
+  emit_note (NOTE_INSN_DELETED);
+  DONE;
+}
+  [(set_attr "move_type" "move,load,load,load")
+   (set_attr "mode" "DI")])
+
+(define_insn "extend<SHORT:mode><GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r")
+	(sign_extend:GPR
+	     (match_operand:SHORT 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   ext.w.<SHORT:size>\t%0,%1
+   ld.<SHORT:size>\t%0,%1
+   ldx.<SHORT:size>\t%0,%1"
+  [(set_attr "move_type" "signext,load,load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "extendqihi2"
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+	(sign_extend:HI
+	     (match_operand:QI 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   ext.w.b\t%0,%1
+   ld.b\t%0,%1
+   ldx.b\t%0,%1"
+  [(set_attr "move_type" "signext,load,load")
+   (set_attr "mode" "SI")])
+
+(define_insn "*extenddi_truncate<mode>"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.<size>\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "DI")])
+
+(define_insn "*extendsi_truncate<mode>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(sign_extend:SI
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.<size>\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "SI")])
+
+(define_insn "*extendhi_truncateqi"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(sign_extend:HI
+	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.b\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "SI")])
+
+(define_insn "extendsfdf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "fcvt.d.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "cnv_mode"	"S2D")
+   (set_attr "mode" "DF")])
+
+;;
+;;  ....................
+;;
+;;	CONVERSIONS
+;;
+;;  ....................
+
+;; Conversion of a floating-point value to an integer.
+
+(define_insn "fix_truncdfsi2"
+  [(set (match_operand:SI 0 "register_operand" "=f")
+	(fix:SI (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.w.d\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"D2I")])
+
+(define_insn "fix_truncsfsi2"
+  [(set (match_operand:SI 0 "register_operand" "=f")
+	(fix:SI (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "ftintrz.w.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"S2I")])
+
+
+(define_insn "fix_truncdfdi2"
+  [(set (match_operand:DI 0 "register_operand" "=f")
+	(fix:DI (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.l.d\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"D2I")])
+
+
+(define_insn "fix_truncsfdi2"
+  [(set (match_operand:DI 0 "register_operand" "=f")
+	(fix:DI (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.l.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"S2I")])
+
+;; Conversion of an integral (or boolean) value to a floating-point value.
+
+(define_insn "floatsidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float:DF (match_operand:SI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.d.w\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"I2D")])
+
+(define_insn "floatdidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float:DF (match_operand:DI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.d.l\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode" "I2D")])
+
+(define_insn "floatsisf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float:SF (match_operand:SI 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "ffint.s.w\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"I2S")])
+
+(define_insn "floatdisf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float:SF (match_operand:DI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.s.l\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"I2S")])
+
+;; Conversion of a floating-point value to an unsigned integer.
+
+(define_expand "fixuns_truncdfsi2"
+  [(set (match_operand:SI 0 "register_operand")
+	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (DFmode);
+  rtx reg2 = gen_reg_rtx (DFmode);
+  rtx reg3 = gen_reg_rtx (SImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 31, DFmode);
+
+  if (reg1)		      /* Turn off complaints about unreached code.  */
+    {
+      loongarch_emit_move (reg1,
+			   const_double_from_real_value (offset, DFmode));
+      do_pending_stack_adjust ();
+
+      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
+
+      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
+      emit_jump_insn (gen_rtx_SET (pc_rtx,
+				   gen_rtx_LABEL_REF (VOIDmode, label2)));
+      emit_barrier ();
+
+      emit_label (label1);
+      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
+      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
+				     (BITMASK_HIGH, SImode)));
+
+      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
+      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
+
+      emit_label (label2);
+
+      /* Allow REG_NOTES to be set on last insn (labels don't have enough
+	 fields, and can't be used for REG_NOTES anyway).  */
+      emit_use (stack_pointer_rtx);
+      DONE;
+    }
+})
+
+(define_expand "fixuns_truncdfdi2"
+  [(set (match_operand:DI 0 "register_operand")
+	(unsigned_fix:DI (match_operand:DF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (DFmode);
+  rtx reg2 = gen_reg_rtx (DFmode);
+  rtx reg3 = gen_reg_rtx (DImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 63, DFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, DFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncdfdi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (BITMASK_HIGH));
+  emit_insn (gen_ashldi3 (reg3, reg3, GEN_INT (32)));
+
+  emit_insn (gen_fix_truncdfdi2 (operands[0], reg2));
+  emit_insn (gen_iordi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+
+(define_expand "fixuns_truncsfsi2"
+  [(set (match_operand:SI 0 "register_operand")
+	(unsigned_fix:SI (match_operand:SF 1 "register_operand")))]
+  "TARGET_HARD_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (SFmode);
+  rtx reg2 = gen_reg_rtx (SFmode);
+  rtx reg3 = gen_reg_rtx (SImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 31, SFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, SFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchsf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncsfsi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (SFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
+				 (BITMASK_HIGH, SImode)));
+
+  emit_insn (gen_fix_truncsfsi2 (operands[0], reg2));
+  emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+
+(define_expand "fixuns_truncsfdi2"
+  [(set (match_operand:DI 0 "register_operand")
+	(unsigned_fix:DI (match_operand:SF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (SFmode);
+  rtx reg2 = gen_reg_rtx (SFmode);
+  rtx reg3 = gen_reg_rtx (DImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 63, SFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, SFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchsf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncsfdi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (SFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (BITMASK_HIGH));
+  emit_insn (gen_ashldi3 (reg3, reg3, GEN_INT (32)));
+
+  emit_insn (gen_fix_truncsfdi2 (operands[0], reg2));
+  emit_insn (gen_iordi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+
+;;
+;;  ....................
+;;
+;;	EXTRACT AND INSERT
+;;
+;;  ....................
+
+(define_expand "extzv<mode>"
+  [(set (match_operand:GPR 0 "register_operand")
+	(zero_extract:GPR (match_operand:GPR 1 "register_operand")
+			  (match_operand 2 "const_int_operand")
+			  (match_operand 3 "const_int_operand")))]
+  ""
+{
+  if (!loongarch_use_ins_ext_p (operands[1], INTVAL (operands[2]),
+			   INTVAL (operands[3])))
+    FAIL;
+})
+
+(define_insn "*extzv<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(zero_extract:GPR (match_operand:GPR 1 "register_operand" "r")
+			  (match_operand 2 "const_int_operand" "")
+			  (match_operand 3 "const_int_operand" "")))]
+  "loongarch_use_ins_ext_p (operands[1], INTVAL (operands[2]),
+		       INTVAL (operands[3]))"
+{
+  operands[2] = GEN_INT (INTVAL (operands[2]) + INTVAL (operands[3]) - 1);
+  return "bstrpick.<d>\t%0,%1,%2,%3";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "insv<mode>"
+  [(set (zero_extract:GPR (match_operand:GPR 0 "register_operand")
+			  (match_operand 1 "const_int_operand")
+			  (match_operand 2 "const_int_operand"))
+	(match_operand:GPR 3 "reg_or_0_operand"))]
+  ""
+{
+  if (!loongarch_use_ins_ext_p (operands[0], INTVAL (operands[1]),
+			   INTVAL (operands[2])))
+    FAIL;
+})
+
+(define_insn "*insv<mode>"
+  [(set (zero_extract:GPR (match_operand:GPR 0 "register_operand" "+r")
+			  (match_operand:SI 1 "const_int_operand" "")
+			  (match_operand:SI 2 "const_int_operand" ""))
+	(match_operand:GPR 3 "reg_or_0_operand" "rJ"))]
+  "loongarch_use_ins_ext_p (operands[0], INTVAL (operands[1]),
+		       INTVAL (operands[2]))"
+{
+  operands[1] = GEN_INT (INTVAL (operands[1]) + INTVAL (operands[2]) - 1);
+  return "bstrins.<d>\t%0,%z3,%1,%2";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ....................
+;;
+;;	DATA MOVEMENT
+;;
+;;  ....................
+
+;; Allow combine to split complex const_int load sequences, using operand 2
+;; to store the intermediate results.  See move_operand for details.
+(define_split
+  [(set (match_operand:GPR 0 "register_operand")
+	(match_operand:GPR 1 "splittable_const_int_operand"))
+   (clobber (match_operand:GPR 2 "register_operand"))]
+  ""
+  [(const_int 0)]
+{
+  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
+  DONE;
+})
+
+;; 64-bit integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+
+(define_expand "movdi"
+  [(set (match_operand:DI 0 "")
+	(match_operand:DI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (DImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movdi_32bit"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m")
+       (match_operand:DI 1 "move_operand" "r,i,w,r,*J*r,*m,*f,*f"))]
+  "!TARGET_64BIT
+   && (register_operand (operands[0], DImode)
+       || reg_or_0_operand (operands[1], DImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore")
+   (set_attr "mode" "DI")])
+
+(define_insn "*movdi_64bit"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m")
+	(match_operand:DI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], DImode)
+       || reg_or_0_operand (operands[1], DImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore")
+   (set_attr "mode" "DI")])
+
+;; 32-bit Integer moves
+
+(define_expand "movsi"
+  [(set (match_operand:SI 0 "")
+	(match_operand:SI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (SImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movsi_internal"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
+	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
+  "(register_operand (operands[0], SImode)
+       || reg_or_0_operand (operands[1], SImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore,mftg,mgtf")
+   (set_attr "compression" "all,*,*,*,*,*,*,*,*,*")
+   (set_attr "mode" "SI")])
+
+;; 16-bit Integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+;; Unsigned loads are used because LOAD_EXTEND_OP returns ZERO_EXTEND.
+
+(define_expand "movhi"
+  [(set (match_operand:HI 0 "")
+	(match_operand:HI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (HImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movhi_internal"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,r,m,r,k")
+	(match_operand:HI 1 "move_operand" "r,Yd,I,m,rJ,k,rJ"))]
+  "(register_operand (operands[0], HImode)
+       || reg_or_0_operand (operands[1], HImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,const,load,store,load,store")
+   (set_attr "compression" "all,all,*,*,*,*,*")
+   (set_attr "mode" "HI")])
+
+;; 8-bit Integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+;; Unsigned loads are used because LOAD_EXTEND_OP returns ZERO_EXTEND.
+
+(define_expand "movqi"
+  [(set (match_operand:QI 0 "")
+	(match_operand:QI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (QImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movqi_internal"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m,r,k")
+	(match_operand:QI 1 "move_operand" "r,I,m,rJ,k,rJ"))]
+  "(register_operand (operands[0], QImode)
+       || reg_or_0_operand (operands[1], QImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,load,store")
+   (set_attr "compression" "all,*,*,*,*,*")
+   (set_attr "mode" "QI")])
+
+;; 32-bit floating point moves
+
+(define_expand "movsf"
+  [(set (match_operand:SF 0 "")
+	(match_operand:SF 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (SFmode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movsf_hardfloat"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
+	(match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*G*r,*m,*r"))]
+  "TARGET_HARD_FLOAT
+   && (register_operand (operands[0], SFmode)
+       || reg_or_0_operand (operands[1], SFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+   (set_attr "mode" "SF")])
+
+(define_insn "*movsf_softfloat"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,m")
+	(match_operand:SF 1 "move_operand" "Gr,m,r"))]
+  "TARGET_SOFT_FLOAT
+   && (register_operand (operands[0], SFmode)
+       || reg_or_0_operand (operands[1], SFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,load,store")
+   (set_attr "mode" "SF")])
+
+;; 64-bit floating point moves
+
+(define_expand "movdf"
+  [(set (match_operand:DF 0 "")
+	(match_operand:DF 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (DFmode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movdf_hardfloat"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
+	(match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*r*G,*m,*r"))]
+  "TARGET_DOUBLE_FLOAT
+   && (register_operand (operands[0], DFmode)
+       || reg_or_0_operand (operands[1], DFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+   (set_attr "mode" "DF")])
+
+(define_insn "*movdf_softfloat"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m")
+	(match_operand:DF 1 "move_operand" "rG,m,rG"))]
+  "(TARGET_SOFT_FLOAT || TARGET_SINGLE_FLOAT)
+   && (register_operand (operands[0], DFmode)
+       || reg_or_0_operand (operands[1], DFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,load,store")
+   (set_attr "mode" "DF")])
+
+
+;; 128-bit integer moves
+
+(define_expand "movti"
+  [(set (match_operand:TI 0)
+	(match_operand:TI 1))]
+  "TARGET_64BIT"
+{
+  if (loongarch_legitimize_move (TImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movti"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=r,r,r,m")
+	(match_operand:TI 1 "move_operand" "r,i,m,rJ"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], TImode)
+       || reg_or_0_operand (operands[1], TImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store")
+   (set (attr "mode")
+    (if_then_else (eq_attr "move_type" "imul")
+		      (const_string "SI")
+		      (const_string "TI")))])
+
+;; 128-bit floating point moves
+
+(define_expand "movtf"
+  [(set (match_operand:TF 0)
+	(match_operand:TF 1))]
+  "TARGET_64BIT"
+{
+  if (loongarch_legitimize_move (TFmode, operands[0], operands[1]))
+    DONE;
+})
+
+;; This pattern handles both hard- and soft-float cases.
+(define_insn "*movtf"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=r,r,m,f,r,f,m")
+	(match_operand:TF 1 "move_operand" "rG,m,rG,rG,f,m,f"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], TFmode)
+       || reg_or_0_operand (operands[1], TFmode))"
+  "#"
+  [(set_attr "move_type" "move,load,store,mgtf,mftg,fpload,fpstore")
+   (set_attr "mode" "TF")])
+
+(define_split
+  [(set (match_operand:MOVE64 0 "nonimmediate_operand")
+	(match_operand:MOVE64 1 "move_operand"))]
+  "reload_completed && loongarch_split_move_insn_p (operands[0], operands[1], insn)"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+(define_split
+  [(set (match_operand:MOVE128 0 "nonimmediate_operand")
+	(match_operand:MOVE128 1 "move_operand"))]
+  "reload_completed && loongarch_split_move_insn_p (operands[0], operands[1], insn)"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Emit a doubleword move in which exactly one of the operands is
+;; a floating-point register.  We can't just emit two normal moves
+;; because of the constraints imposed by the FPU register model;
+;; see loongarch_can_change_mode_class for details.  Instead, we keep
+;; the FPR whole and use special patterns to refer to each word of
+;; the other operand.
+
+(define_expand "move_doubleword_fpr<mode>"
+  [(set (match_operand:SPLITF 0)
+	(match_operand:SPLITF 1))]
+  ""
+{
+  if (FP_REG_RTX_P (operands[0]))
+    {
+      rtx low = loongarch_subword (operands[1], 0);
+      rtx high = loongarch_subword (operands[1], 1);
+      emit_insn (gen_load_low<mode> (operands[0], low));
+      if (!TARGET_64BIT)
+       emit_insn (gen_movgr2frh<mode> (operands[0], high, operands[0]));
+      else
+       emit_insn (gen_load_high<mode> (operands[0], high, operands[0]));
+    }
+  else
+    {
+      rtx low = loongarch_subword (operands[0], 0);
+      rtx high = loongarch_subword (operands[0], 1);
+      emit_insn (gen_store_word<mode> (low, operands[1], const0_rtx));
+      if (!TARGET_64BIT)
+       emit_insn (gen_movfrh2gr<mode> (high, operands[1]));
+      else
+       emit_insn (gen_store_word<mode> (high, operands[1], const1_rtx));
+    }
+  DONE;
+})
+
+;; Conditional move instructions.
+
+(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(if_then_else:GPR
+	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
+			   (const_int 0))
+	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
+	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
+  "register_operand (operands[2], <GPR:MODE>mode)
+       != register_operand (operands[3], <GPR:MODE>mode)"
+  "@
+   <sel>\t%0,%2,%1
+   <selinv>\t%0,%3,%1"
+  [(set_attr "type" "condmove")
+   (set_attr "mode" "<GPR:MODE>")])
+
+;; fsel copies its second source register when the condition flag is
+;; non-zero and its first source register when the flag is zero.  This
+;; means operands 2 and 3 are inverted in the instruction.
+
+(define_insn "*sel<mode>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(if_then_else:ANYF
+	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
+		 (const_int 0))
+	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
+	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fsel\t%0,%3,%2,%1"
+  [(set_attr "type" "condmove")
+   (set_attr "mode" "<ANYF:MODE>")])
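+
+;; Worked example of the *sel<mode> template above (register names are
+;; illustrative): with operand 0 in $f0, operand 2 in $f1, operand 3 in
+;; $f2 and the condition flag in $fcc0, the template prints
+;; "fsel $f0,$f2,$f1,$fcc0", i.e. operand 3 comes before operand 2.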
+
+;; These are the main define_expand's used to make conditional moves.
+
+(define_expand "mov<mode>cc"
+  [(set (match_operand:GPR 0 "register_operand")
+	(if_then_else:GPR (match_operator 1 "comparison_operator"
+			 [(match_operand:GPR 2 "reg_or_0_operand")
+			  (match_operand:GPR 3 "reg_or_0_operand")])))]
+  "TARGET_COND_MOVE_INT"
+{
+  if (!INTEGRAL_MODE_P (GET_MODE (XEXP (operands[1], 0))))
+    FAIL;
+
+  loongarch_expand_conditional_move (operands);
+  DONE;
+})
+
+(define_expand "mov<mode>cc"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(if_then_else:ANYF (match_operator 1 "comparison_operator"
+			  [(match_operand:ANYF 2 "reg_or_0_operand")
+			   (match_operand:ANYF 3 "reg_or_0_operand")])))]
+  "TARGET_COND_MOVE_FLOAT"
+{
+  if (!FLOAT_MODE_P (GET_MODE (XEXP (operands[1], 0))))
+    FAIL;
+
+  loongarch_expand_conditional_move (operands);
+  DONE;
+})
+
+(define_insn "lu32i_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI
+	  (zero_extend:DI
+	    (subreg:SI (match_operand:DI 1 "register_operand" "0") 0))
+	  (match_operand:DI 2 "const_lu32i_operand" "u")))]
+  "TARGET_64BIT"
+  "lu32i.d\t%0,%X2>>32"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")])
+
+(define_insn "lu52i_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI
+	  (and:DI (match_operand:DI 1 "register_operand" "r")
+		  (match_operand 2 "lu52i_mask_operand"))
+	  (match_operand 3 "const_lu52i_operand" "v")))]
+    "TARGET_64BIT"
+    "lu52i.d\t%0,%1,%X3>>52"
+    [(set_attr "type" "arith")
+     (set_attr "mode" "DI")])
+
+;; Round floating-point numbers to integral values, keeping the result
+;; in floating-point format.
+(define_insn "frint_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+		      UNSPEC_FRINT))]
+  "TARGET_HARD_FLOAT"
+  "frint.<fmt>\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<MODE>")])
+
+;; LoongArch supports loading and storing a floating point register from
+;; the sum of two general-purpose registers.  We use two versions for each of
+;; these four instructions: one where the two general-purpose registers are
+;; SImode, and one where they are DImode.  This is because general-purpose
+;; registers will be in SImode when they hold 32-bit values, but,
+;; since the 32-bit values are always sign extended, the f{ld/st}x.{s/d}
+;; instructions will still work correctly.
+
+;; ??? Perhaps it would be better to support these instructions by
+;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
+;; these instructions can only be used to load and store floating
+;; point registers, that would probably cause trouble in reload.
+
+(define_insn "*<ANYF:floadx>_<P:mode>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(mem:ANYF (plus:P (match_operand:P 1 "register_operand" "r")
+			  (match_operand:P 2 "register_operand" "r"))))]
+  "TARGET_HARD_FLOAT"
+  "<ANYF:floadx>\t%0,%1,%2"
+  [(set_attr "type" "fpidxload")
+   (set_attr "mode" "<ANYF:UNITMODE>")])
+
+(define_insn "*<ANYF:fstorex>_<P:mode>"
+  [(set (mem:ANYF (plus:P (match_operand:P 1 "register_operand" "r")
+			  (match_operand:P 2 "register_operand" "r")))
+	(match_operand:ANYF 0 "register_operand" "f"))]
+  "TARGET_HARD_FLOAT"
+  "<ANYF:fstorex>\t%0,%1,%2"
+  [(set_attr "type" "fpidxstore")
+   (set_attr "mode" "<ANYF:UNITMODE>")])
+
+;; Loading and storing an integer register using an address formed from
+;; the sum of two general-purpose registers.
+
+(define_insn "*<GPR:loadx>_<P:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(mem:GPR
+	    (plus:P (match_operand:P 1 "register_operand" "r")
+		    (match_operand:P 2 "register_operand" "r"))))]
+  ""
+  "<GPR:loadx>\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*<GPR:storex>_<P:mode>"
+  [(set (mem:GPR (plus:P (match_operand:P 1 "register_operand" "r")
+			 (match_operand:P 2 "register_operand" "r")))
+	(match_operand:GPR 0 "reg_or_0_operand" "rJ"))]
+  ""
+  "<GPR:storex>\t%z0,%1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "<GPR:MODE>")])
+
+;; SHORT mode sign_extend.
+(define_insn "*extend_<SHORT:loadx>_<GPR:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(sign_extend:GPR
+	  (mem:SHORT
+	    (plus:P (match_operand:P 1 "register_operand" "r")
+		    (match_operand:P 2 "register_operand" "r")))))]
+  ""
+  "<SHORT:loadx>\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*extend_<SHORT:storex>"
+  [(set (mem:SHORT (plus:P (match_operand:P 1 "register_operand" "r")
+			   (match_operand:P 2 "register_operand" "r")))
+	(match_operand:SHORT 0 "reg_or_0_operand" "rJ"))]
+  ""
+  "<SHORT:storex>\t%z0,%1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "SI")])
+
+;; Load the low word of operand 0 with operand 1.
+(define_insn "load_low<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f,f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "general_operand" "rJ,m")]
+		       UNSPEC_LOAD_LOW))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[0] = loongarch_subword (operands[0], 0);
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mgtf,fpload")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Load the high word of operand 0 from operand 1, preserving the value
+;; in the low word.
+(define_insn "load_high<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f,f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "general_operand" "rJ,m")
+			(match_operand:SPLITF 2 "register_operand" "0,0")]
+		       UNSPEC_LOAD_HIGH))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[0] = loongarch_subword (operands[0], 1);
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mgtf,fpload")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Store one word of operand 1 in operand 0.  Operand 2 is 1 to store the
+;; high word and 0 to store the low word.
+(define_insn "store_word<mode>"
+  [(set (match_operand:<HALFMODE> 0 "nonimmediate_operand" "=r,m")
+	(unspec:<HALFMODE> [(match_operand:SPLITF 1 "register_operand" "f,f")
+			    (match_operand 2 "const_int_operand")]
+			   UNSPEC_STORE_WORD))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[1] = loongarch_subword (operands[1], INTVAL (operands[2]));
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mftg,fpstore")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Thread-Local Storage
+
+(define_insn "got_load_tls_gd<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_GD))]
+  ""
+  "la.tls.gd\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_ld<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_LD))]
+  ""
+  "la.tls.ld\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_le<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_LE))]
+  ""
+  "la.tls.le\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_ie<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_IE))]
+  ""
+  "la.tls.ie\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+;; Move operand 1 to the high word of operand 0 using movgr2frh.w, preserving the
+;; value in the low word.
+(define_insn "movgr2frh<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "reg_or_0_operand" "rJ")
+			(match_operand:SPLITF 2 "register_operand" "0")]
+			UNSPEC_MOVGR2FRH))]
+  "TARGET_DOUBLE_FLOAT"
+  "movgr2frh.w\t%z1,%0"
+  [(set_attr "move_type" "mgtf")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Move high word of operand 1 to operand 0 using movfrh2gr.s.
+(define_insn "movfrh2gr<mode>"
+  [(set (match_operand:<HALFMODE> 0 "register_operand" "=r")
+	(unspec:<HALFMODE> [(match_operand:SPLITF 1 "register_operand" "f")]
+			    UNSPEC_MOVFRH2GR))]
+  "TARGET_DOUBLE_FLOAT"
+  "movfrh2gr.s\t%0,%1"
+  [(set_attr "move_type" "mftg")
+   (set_attr "mode" "<HALFMODE>")])
+
+
+;; Expand in-line code to clear the instruction cache between operand[0] and
+;; operand[1].
+(define_expand "clear_cache"
+  [(match_operand 0 "pmode_register_operand")
+   (match_operand 1 "pmode_register_operand")]
+  ""
+{
+  emit_insn (gen_ibar (const0_rtx));
+  DONE;
+})
+
+(define_insn "ibar"
+  [(unspec_volatile:SI [(match_operand 0 "const_uimm15_operand")] UNSPECV_IBAR)]
+  ""
+  "ibar\t%0")
+
+(define_insn "dbar"
+  [(unspec_volatile:SI [(match_operand 0 "const_uimm15_operand")] UNSPECV_DBAR)]
+  ""
+  "dbar\t%0")
+
+
+
+;; Privileged state instruction
+
+(define_insn "cpucfg"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec_volatile:SI [(match_operand:SI 1 "register_operand" "r")]
+			     UNSPECV_CPUCFG))]
+  ""
+  "cpucfg\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "SI")])
+
+(define_insn "asrtle_d"
+	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
+			      (match_operand:DI 1 "register_operand" "r")]
+			      UNSPECV_ASRTLE_D)]
+  "TARGET_64BIT"
+  "asrtle.d\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_insn "asrtgt_d"
+	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
+			      (match_operand:DI 1 "register_operand" "r")]
+			      UNSPECV_ASRTGT_D)]
+  "TARGET_64BIT"
+  "asrtgt.d\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_insn "<p>csrrd"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(unspec_volatile:GPR [(match_operand  1 "const_uimm14_operand")]
+			     UNSPECV_CSRRD))]
+  ""
+  "csrrd\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>csrwr"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	  (unspec_volatile:GPR
+	  [(match_operand:GPR 1 "register_operand" "0")
+	   (match_operand 2 "const_uimm14_operand")]
+	  UNSPECV_CSRWR))]
+  ""
+  "csrwr\t%0,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>csrxchg"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	  (unspec_volatile:GPR
+	  [(match_operand:GPR 1 "register_operand" "0")
+	   (match_operand:GPR 2 "register_operand" "q")
+	   (match_operand 3 "const_uimm14_operand")]
+	  UNSPECV_CSRXCHG))]
+  ""
+  "csrxchg\t%0,%2,%3"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "iocsrrd_<size>"
+  [(set (match_operand:QHWD 0 "register_operand" "=r")
+	(unspec_volatile:QHWD [(match_operand:SI 1 "register_operand" "r")]
+			      UNSPECV_IOCSRRD))]
+  ""
+  "iocsrrd.<size>\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "iocsrwr_<size>"
+  [(unspec_volatile:QHWD [(match_operand:QHWD 0 "register_operand" "r")
+			  (match_operand:SI 1 "register_operand" "r")]
+			UNSPECV_IOCSRWR)]
+  ""
+  "iocsrwr.<size>\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>cacop"
+  [(unspec_volatile:X [(match_operand 0 "const_uimm5_operand")
+			 (match_operand:X 1 "register_operand" "r")
+			 (match_operand 2 "const_imm12_operand")]
+			 UNSPECV_CACOP)]
+  ""
+  "cacop\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>lddir"
+  [(unspec_volatile:X [(match_operand:X 0 "register_operand" "r")
+			 (match_operand:X 1 "register_operand" "r")
+			 (match_operand 2 "const_uimm5_operand")]
+			 UNSPECV_LDDIR)]
+  ""
+  "lddir\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>ldpte"
+  [(unspec_volatile:X [(match_operand:X 0 "register_operand" "r")
+			 (match_operand 1 "const_uimm5_operand")]
+			 UNSPECV_LDPTE)]
+  ""
+  "ldpte\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+
+;; Block moves, see loongarch.c for more details.
+;; Argument 0 is the destination.
+;; Argument 1 is the source.
+;; Argument 2 is the length.
+;; Argument 3 is the alignment.
+
+(define_expand "cpymemsi"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+		   (match_operand:BLK 1 "general_operand"))
+	      (use (match_operand:SI 2 ""))
+	      (use (match_operand:SI 3 "const_int_operand"))])]
+  " !TARGET_MEMCPY"
+{
+  if (loongarch_expand_block_move (operands[0], operands[1], operands[2]))
+    DONE;
+  else
+    FAIL;
+})
+
+;;
+;;  ....................
+;;
+;;	SHIFTS
+;;
+;;  ....................
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(any_shift:GPR (match_operand:GPR 1 "register_operand" "r")
+		       (match_operand:SI 2 "arith_operand" "rI")))]
+  ""
+{
+  if (CONST_INT_P (operands[2]))
+    operands[2] = GEN_INT (INTVAL (operands[2])
+			   & (GET_MODE_BITSIZE (<MODE>mode) - 1));
+
+  return "<insn>%i2.<d>\t%0,%1,%2";
+}
+  [(set_attr "type" "shift")
+   (set_attr "compression" "none")
+   (set_attr "mode" "<MODE>")])
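+
+;; Worked example of the masking in <optab><mode>3 above (illustrative
+;; values): a constant count is reduced modulo the mode size before it is
+;; printed, so an SImode shift by 33 is emitted with count 33 & 31 = 1
+;; and a DImode shift by 65 with count 65 & 63 = 1.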
+
+(define_insn "*<optab>si3_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	   (any_shift:SI (match_operand:SI 1 "register_operand" "r")
+			 (match_operand:SI 2 "arith_operand" "rI"))))]
+  "TARGET_64BIT"
+{
+  if (CONST_INT_P (operands[2]))
+    operands[2] = GEN_INT (INTVAL (operands[2]) & 0x1f);
+
+  return "<insn>%i2.w\t%0,%1,%2";
+}
+  [(set_attr "type" "shift")
+   (set_attr "mode" "SI")])
+
+(define_insn "rotr<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(rotatert:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		      (match_operand:SI 2 "arith_operand" "r,I")))]
+  ""
+  "rotr%i2.<d>\t%0,%1,%2"
+  [(set_attr "type" "shift,shift")
+   (set_attr "mode" "<MODE>")])
+
+;; The following templates generate "bstrpick.d + alsl.d" instruction
+;; pairs.  The values of const_immalsl_operand and immediate_operand
+;; must satisfy:
+;;
+;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
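+;;
+;; For example (illustrative values), a shift amount of 2 paired with a
+;; mask of 0x3fffffffc qualifies, since 0xffffffff << 2 == 0x3fffffffc
+;; and therefore (0x3fffffffc >> 2) == 0xffffffff.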
+
+(define_insn "zero_extend_ashift1"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
+			   (match_operand 2 "const_immalsl_operand" ""))
+		(match_operand 3 "immediate_operand" "")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "zero_extend_ashift2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+			   (match_operand 2 "const_immalsl_operand" ""))
+		(match_operand 3 "immediate_operand" "")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "alsl_paired1"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
+				    (match_operand 2 "const_immalsl_operand" ""))
+			 (match_operand 3 "immediate_operand" ""))
+		 (match_operand:DI 4 "register_operand" "r")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "alsl_paired2"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (match_operand:DI 1 "register_operand" "r")
+		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
+				    (match_operand 3 "const_immalsl_operand" ""))
+			 (match_operand 4 "immediate_operand" ""))))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
+  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "alsl<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(plus:GPR (ashift:GPR (match_operand:GPR 1 "register_operand" "r")
+			      (match_operand 2 "const_immalsl_operand" ""))
+		  (match_operand:GPR 3 "register_operand" "r")))]
+  ""
+  "alsl.<d>\t%0,%1,%3,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
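+
+;; Worked example of alsl<mode>3 above (illustrative values): the pattern
+;; computes (operand 1 << operand 2) + operand 3, so with operand 1 = 5,
+;; operand 2 = 2 and operand 3 = 100 the result is (5 << 2) + 100 = 120.
+;; This assumes const_immalsl_operand accepts the shift amount 2.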
+
+
+
+;; Reverse the order of bytes of operand 1 and store the result in operand 0.
+
+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(bswap:HI (match_operand:HI 1 "register_operand" "r")))]
+  ""
+  "revb.2h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn_and_split "bswapsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(bswap:SI (match_operand:SI 1 "register_operand" "r")))]
+  ""
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_REVB_2H))
+   (set (match_dup 0) (rotatert:SI (match_dup 0) (const_int 16)))]
+  ""
+  [(set_attr "insn_count" "2")])
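+
+;; Worked example of the bswapsi2 split above (illustrative value):
+;; revb.2h turns 0x11223344 into 0x22114433 by swapping the bytes in
+;; each halfword, and rotating that right by 16 gives 0x44332211.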
+
+(define_insn_and_split "bswapdi2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(bswap:DI (match_operand:DI 1 "register_operand" "r")))]
+  "TARGET_64BIT"
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:DI [(match_dup 1)] UNSPEC_REVB_4H))
+   (set (match_dup 0) (unspec:DI [(match_dup 0)] UNSPEC_REVH_D))]
+  ""
+  [(set_attr "insn_count" "2")])
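+
+;; Worked example of the bswapdi2 split above (illustrative value):
+;; revb.4h turns 0x1122334455667788 into 0x2211443366558877 by swapping
+;; the bytes in each halfword, and revh.d then reverses the order of the
+;; four halfwords, giving 0x8877665544332211.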
+
+(define_insn "revb_2h"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")] UNSPEC_REVB_2H))]
+  ""
+  "revb.2h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "revb_4h"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_REVB_4H))]
+  "TARGET_64BIT"
+  "revb.4h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "revh_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_REVH_D))]
+  "TARGET_64BIT"
+  "revh.d\t%0,%1"
+  [(set_attr "type" "shift")])
+
+;;
+;;  ....................
+;;
+;;	CONDITIONAL BRANCHES
+;;
+;;  ....................
+
+;; Conditional branches
+
+(define_insn "*branch_fp_FCCmode"
+  [(set (pc)
+	(if_then_else
+	  (match_operator 1 "equality_operator"
+	      [(match_operand:FCC 2 "register_operand" "z")
+		(const_int 0)])
+	  (label_ref (match_operand 0 "" ""))
+	(pc)))]
+  "TARGET_HARD_FLOAT"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					      LARCH_BRANCH ("b%F1", "%Z2%0"),
+					      LARCH_BRANCH ("b%W1", "%Z2%0"));
+}
+  [(set_attr "type" "branch")])
+
+(define_insn "*branch_fp_inverted_FCCmode"
+  [(set (pc)
+	(if_then_else
+	  (match_operator 1 "equality_operator"
+	    [(match_operand:FCC 2 "register_operand" "z")
+	    (const_int 0)])
+	    (pc)
+	  (label_ref (match_operand 0 "" ""))))]
+  "TARGET_HARD_FLOAT"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					      LARCH_BRANCH ("b%W1", "%Z2%0"),
+					      LARCH_BRANCH ("b%F1", "%Z2%0"));
+}
+  [(set_attr "type" "branch")])
+
+;; Conditional branches on ordered comparisons with zero.
+
+(define_insn "*branch_order<mode>"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "order_operator"
+			 [(match_operand:GPR 2 "register_operand" "r,r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "J,r")])
+	 (label_ref (match_operand 0 "" ""))
+	 (pc)))]
+  ""
+  { return loongarch_output_order_conditional_branch (insn, operands, false); }
+  [(set_attr "type" "branch")])
+
+(define_insn "*branch_order<mode>_inverted"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "order_operator"
+			 [(match_operand:GPR 2 "register_operand" "r,r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "J,r")])
+	 (pc)
+	 (label_ref (match_operand 0 "" ""))))]
+  ""
+  { return loongarch_output_order_conditional_branch (insn, operands, true); }
+  [(set_attr "type" "branch")])
+
+;; Conditional branch on equality comparison.
+
+(define_insn "*branch_equality<mode>"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "equality_operator"
+			 [(match_operand:GPR 2 "register_operand" "r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "rJ")])
+	 (label_ref (match_operand 0 "" ""))
+	 (pc)))]
+  ""
+  { return loongarch_output_equal_conditional_branch (insn, operands, false); }
+  [(set_attr "type" "branch")])
+
+
+(define_insn "*branch_equality<mode>_inverted"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "equality_operator"
+			 [(match_operand:GPR 2 "register_operand" "r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "rJ")])
+	 (pc)
+	 (label_ref (match_operand 0 "" ""))))]
+  ""
+  { return loongarch_output_equal_conditional_branch (insn, operands, true); }
+  [(set_attr "type" "branch")])
+
+
+(define_expand "cbranch<mode>4"
+  [(set (pc)
+	(if_then_else (match_operator 0 "comparison_operator"
+		      [(match_operand:GPR 1 "register_operand")
+			(match_operand:GPR 2 "nonmemory_operand")])
+		      (label_ref (match_operand 3 ""))
+		      (pc)))]
+  ""
+{
+  loongarch_expand_conditional_branch (operands);
+  DONE;
+})
+
+(define_expand "cbranch<mode>4"
+  [(set (pc)
+	(if_then_else (match_operator 0 "comparison_operator"
+			[(match_operand:ANYF 1 "register_operand")
+			(match_operand:ANYF 2 "register_operand")])
+		      (label_ref (match_operand 3 ""))
+		      (pc)))]
+  ""
+{
+  loongarch_expand_conditional_branch (operands);
+  DONE;
+})
+
+;; Used to implement built-in functions.
+(define_expand "condjump"
+  [(set (pc)
+	(if_then_else (match_operand 0)
+		      (label_ref (match_operand 1))
+		      (pc)))])
+
+
+
+;;
+;;  ....................
+;;
+;;	SETTING A REGISTER FROM A COMPARISON
+;;
+;;  ....................
+
+;; Destination is always set in SI mode.
+
+(define_expand "cstore<mode>4"
+  [(set (match_operand:SI 0 "register_operand")
+	(match_operator:SI 1 "loongarch_cstore_operator"
+	 [(match_operand:GPR 2 "register_operand")
+	  (match_operand:GPR 3 "nonmemory_operand")]))]
+  ""
+{
+  loongarch_expand_scc (operands);
+  DONE;
+})
+
+(define_insn "*seq_zero_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(eq:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		 (const_int 0)))]
+  ""
+  "sltui\t%0,%1,1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+
+(define_insn "*sne_zero_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(ne:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		 (const_int 0)))]
+  ""
+  "sltu\t%0,%.,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sgt<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_gt:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "reg_or_0_operand" "rJ")))]
+  ""
+  "slt<u>\t%0,%z2,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sge<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_ge:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (const_int 1)))]
+  ""
+  "slt<u>i\t%0,%.,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*slt<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_lt:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "arith_operand" "rI")))]
+  ""
+  "slt<u>%i2\t%0,%1,%2"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sle<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_le:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "sle_operand" "")))]
+  ""
+{
+  operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+  return "slt<u>i\t%0,%1,%2";
+}
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+
+;;
+;;  ....................
+;;
+;;	FLOATING POINT COMPARISONS
+;;
+;;  ....................
+
+(define_insn "s<code>_<ANYF:mode>_using_FCCmode"
+  [(set (match_operand:FCC 0 "register_operand" "=z")
+	(fcond:FCC (match_operand:ANYF 1 "register_operand" "f")
+		    (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fcmp.<fcond>.<fmt>\t%Z0%1,%2"
+  [(set_attr "type" "fcmp")
+   (set_attr "mode" "FCC")])
+
+
+;;
+;;  ....................
+;;
+;;	UNCONDITIONAL BRANCHES
+;;
+;;  ....................
+
+;; Unconditional branches.
+
+(define_expand "jump"
+  [(set (pc)
+	(label_ref (match_operand 0)))])
+
+(define_insn "*jump_absolute"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "!flag_pic"
+{
+  return "b\t%l0";
+}
+  [(set_attr "type" "branch")])
+
+(define_insn "*jump_pic"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "flag_pic"
+{
+  return "b\t%0";
+}
+  [(set_attr "type" "branch")])
+
+(define_expand "indirect_jump"
+  [(set (pc) (match_operand 0 "register_operand"))]
+  ""
+{
+  operands[0] = force_reg (Pmode, operands[0]);
+  emit_jump_insn (PMODE_INSN (gen_indirect_jump, (operands[0])));
+  DONE;
+})
+
+(define_insn "indirect_jump_<mode>"
+  [(set (pc) (match_operand:P 0 "register_operand" "r"))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type" "jump")
+   (set_attr "mode" "none")])
+
+(define_expand "tablejump"
+  [(set (pc)
+	(match_operand 0 "register_operand"))
+   (use (label_ref (match_operand 1 "")))]
+  ""
+{
+  if (flag_pic)
+      operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
+					 gen_rtx_LABEL_REF (Pmode,
+							    operands[1]),
+					 NULL_RTX, 0, OPTAB_DIRECT);
+  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
+  DONE;
+})
+
+(define_insn "tablejump_<mode>"
+  [(set (pc)
+	(match_operand:P 0 "register_operand" "r"))
+   (use (label_ref (match_operand 1 "" "")))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type" "jump")
+   (set_attr "mode" "none")])
+
+
+
+;;
+;;  ....................
+;;
+;;	Function prologue/epilogue
+;;
+;;  ....................
+;;
+
+(define_expand "prologue"
+  [(const_int 1)]
+  ""
+{
+  loongarch_expand_prologue ();
+  DONE;
+})
+
+;; Block any insns from being moved before this point, since the
+;; profiling call to mcount can use various registers that aren't
+;; saved or used to pass arguments.
+
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "type" "ghost")
+   (set_attr "mode" "none")])
+
+(define_insn "probe_stack_range_<P:mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec_volatile:P [(match_operand:P 1 "register_operand" "0")
+			    (match_operand:P 2 "register_operand" "r")
+			    (match_operand:P 3 "register_operand" "r")]
+			    UNSPECV_PROBE_STACK_RANGE))]
+  ""
+{
+  return loongarch_output_probe_stack_range (operands[0],
+					     operands[2],
+					     operands[3]);
+}
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "epilogue"
+  [(const_int 2)]
+  ""
+{
+  loongarch_expand_epilogue (false);
+  DONE;
+})
+
+(define_expand "sibcall_epilogue"
+  [(const_int 2)]
+  ""
+{
+  loongarch_expand_epilogue (true);
+  DONE;
+})
+
+;; Trivial return.  Make it look like a normal return insn as that
+;; allows jump optimizations to work better.
+
+(define_expand "return"
+  [(simple_return)]
+  "loongarch_can_use_return_insn ()"
+  { })
+
+(define_expand "simple_return"
+  [(simple_return)]
+  ""
+  { })
+
+(define_insn "*<optab>"
+  [(any_return)]
+  ""
+  {
+    operands[0] = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
+    return "jr\t%0";
+  }
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
+;; Normal return.
+
+(define_insn "<optab>_internal"
+  [(any_return)
+   (use (match_operand 0 "pmode_register_operand" ""))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
+;; Exception return.
+(define_insn "loongarch_ertn"
+  [(return)
+   (unspec_volatile [(const_int 0)] UNSPECV_ERTN)]
+  ""
+  "ertn"
+  [(set_attr "type"	"trap")
+   (set_attr "mode"	"none")])
+
+;; This is used in compiling the unwind routines.
+(define_expand "eh_return"
+  [(use (match_operand 0 "general_operand"))]
+  ""
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = convert_to_mode (word_mode, operands[0], 0);
+  if (TARGET_64BIT)
+    emit_insn (gen_eh_set_ra_di (operands[0]));
+  else
+    emit_insn (gen_eh_set_ra_si (operands[0]));
+  DONE;
+})
+
+;; Clobber the return address on the stack.  We can't expand this
+;; until we know where it will be put in the stack frame.
+
+(define_insn "eh_set_ra_si"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch:SI 1 "=&r"))]
+  "! TARGET_64BIT"
+  "#")
+
+(define_insn "eh_set_ra_di"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch:DI 1 "=&r"))]
+  "TARGET_64BIT"
+  "#")
+
+(define_split
+  [(unspec [(match_operand 0 "register_operand")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch 1))]
+  "reload_completed"
+  [(const_int 0)]
+{
+  loongarch_set_return_address (operands[0], operands[1]);
+  DONE;
+})
+
+
+
+;;
+;;  ....................
+;;
+;;	FUNCTION CALLS
+;;
+;;  ....................
+
+;; Sibling calls.  All these patterns use jump instructions.
+
+(define_expand "sibcall"
+  [(parallel [(call (match_operand 0 "")
+		    (match_operand 1 ""))
+	      (use (match_operand 2 ""))	;; next_arg_reg
+	      (use (match_operand 3 ""))])]	;; struct_value_size_rtx
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[0], 0));
+
+  emit_call_insn (gen_sibcall_internal (target, operands[1]));
+  DONE;
+})
+
+(define_insn "sibcall_internal"
+  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
+	 (match_operand 1 "" ""))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jr\t%0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "b\t%0";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%0";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%0\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%0\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "b\t%%plt(%0)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_expand "sibcall_value"
+  [(parallel [(set (match_operand 0 "")
+		   (call (match_operand 1 "")
+			 (match_operand 2 "")))
+	      (use (match_operand 3 ""))])]		;; next_arg_reg
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
+
+  /* Handle return values created by loongarch_pass_fpr_pair.  */
+  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
+    {
+      rtx arg1 = XEXP (XVECEXP (operands[0], 0, 0), 0);
+      rtx arg2 = XEXP (XVECEXP (operands[0], 0, 1), 0);
+
+      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
+							   operands[2],
+							   arg2));
+    }
+  else
+    {
+      /* Handle return values created by loongarch_return_fpr_single.  */
+      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
+	operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
+
+      emit_call_insn (gen_sibcall_value_internal (operands[0], target,
+						  operands[2]));
+    }
+  DONE;
+})
+
+(define_insn "sibcall_value_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "j,c,a,t,h"))
+	      (match_operand 2 "" "")))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+  {
+    case 0:
+      return "jr\t%1";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%1+4)-((%%pcrel(%1+4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "b\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "b\t%%plt(%1)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+  }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_insn "sibcall_value_multiple_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "j,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (set (match_operand 3 "register_operand" "")
+	(call (mem:SI (match_dup 1))
+	      (match_dup 2)))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+  {
+    case 0:
+      return "jr\t%1";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "b\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "b\t%%plt(%1)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+  }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_expand "call"
+  [(parallel [(call (match_operand 0 "")
+		    (match_operand 1 ""))
+	      (use (match_operand 2 ""))	;; next_arg_reg
+	      (use (match_operand 3 ""))])]	;; struct_value_size_rtx
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[0], 0));
+
+  emit_call_insn (gen_call_internal (target, operands[1]));
+  DONE;
+})
+
+(define_insn "call_internal"
+  [(call (mem:SI (match_operand 0 "call_insn_operand" "e,c,a,t,h"))
+	 (match_operand 1 "" ""))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%0,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%0+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%0";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%0";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%0\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%0\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%0)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%0)";
+      else
+	{
+	  sorry ("cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+(define_expand "call_value"
+  [(parallel [(set (match_operand 0 "")
+		   (call (match_operand 1 "")
+			 (match_operand 2 "")))
+	      (use (match_operand 3 ""))])]		;; next_arg_reg
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
+  /* Handle return values created by loongarch_pass_fpr_pair.  */
+  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
+    {
+      rtx arg1 = XEXP (XVECEXP (operands[0], 0, 0), 0);
+      rtx arg2 = XEXP (XVECEXP (operands[0], 0, 1), 0);
+
+      emit_call_insn (gen_call_value_multiple_internal (arg1, target,
+							operands[2], arg2));
+    }
+  else
+    {
+      /* Handle return values created by loongarch_return_fpr_single.  */
+      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
+	operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
+
+      emit_call_insn (gen_call_value_internal (operands[0], target,
+					       operands[2]));
+    }
+  DONE;
+})
+
+(define_insn "call_value_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "e,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%1,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%1)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+(define_insn "call_value_multiple_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "e,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (set (match_operand 3 "register_operand" "")
+	(call (mem:SI (match_dup 1))
+	      (match_dup 2)))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%1,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%1)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static do not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+
+;; Call subroutine returning any type.
+(define_expand "untyped_call"
+  [(parallel [(call (match_operand 0 "")
+		    (const_int 0))
+	      (match_operand 1 "")
+	      (match_operand 2 "")])]
+  ""
+{
+  int i;
+
+  emit_call_insn (gen_call (operands[0], const0_rtx, NULL, const0_rtx));
+
+  for (i = 0; i < XVECLEN (operands[2], 0); i++)
+    {
+      rtx set = XVECEXP (operands[2], 0, i);
+      loongarch_emit_move (SET_DEST (set), SET_SRC (set));
+    }
+
+  emit_insn (gen_blockage ());
+  DONE;
+})
+
+;;
+;;  ....................
+;;
+;;	MISC.
+;;
+;;  ....................
+;;
+
+(define_insn "nop"
+  [(const_int 0)]
+  ""
+  "nop"
+  [(set_attr "type"	"nop")
+   (set_attr "mode"	"none")])
+
+;; __builtin_loongarch_movfcsr2gr: move the FCSR into operand 0.
+(define_insn "loongarch_movfcsr2gr"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
+    UNSPECV_MOVFCSR2GR))]
+  "TARGET_HARD_FLOAT"
+  "movfcsr2gr\t%0,$r%1")
+
+;; __builtin_loongarch_movgr2fcsr: move operand 0 into the FCSR.
+(define_insn "loongarch_movgr2fcsr"
+  [(unspec_volatile [(match_operand 0 "const_uimm5_operand")
+		     (match_operand:SI 1 "register_operand" "r")]
+	  UNSPECV_MOVGR2FCSR)]
+  "TARGET_HARD_FLOAT"
+  "movgr2fcsr\t$r%0,%1")
+
+(define_insn "fclass_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+		UNSPEC_FCLASS))]
+  "TARGET_HARD_FLOAT"
+  "fclass.<fmt>\t%0,%1"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "bytepick_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")
+		   (match_operand:SI 3 "const_0_to_3_operand" "n")]
+	      UNSPEC_BYTEPICK_W))]
+  ""
+  "bytepick.w\t%0,%1,%2,%z3"
+  [(set_attr "mode"    "SI")])
+
+(define_insn "bytepick_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+		   (match_operand:DI 2 "register_operand" "r")
+		   (match_operand:DI 3 "const_0_to_7_operand" "n")]
+	      UNSPEC_BYTEPICK_D))]
+  ""
+  "bytepick.d\t%0,%1,%2,%z3"
+  [(set_attr "mode"    "DI")])
+
+(define_insn "bitrev_4b"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")]
+	    UNSPEC_BITREV_4B))]
+  ""
+  "bitrev.4b\t%0,%1"
+  [(set_attr "type"    "unknown")
+   (set_attr "mode"    "SI")])
+
+(define_insn "bitrev_8b"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")]
+	    UNSPEC_BITREV_8B))]
+  ""
+  "bitrev.8b\t%0,%1"
+  [(set_attr "type"    "unknown")
+   (set_attr "mode"    "DI")])
+
+(define_insn "stack_tie<mode>"
+  [(set (mem:BLK (scratch))
+	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
+		     (match_operand:GPR 1 "register_operand" "r")]
+		    UNSPEC_TIE))]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
+(define_insn "gpr_restore_return"
+  [(return)
+   (use (match_operand 0 "pmode_register_operand" ""))
+   (const_int 0)]
+  ""
+  "")
+
+(define_split
+  [(match_operand 0 "small_data_pattern")]
+  "reload_completed"
+  [(match_dup 0)]
+  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
+
+
+;; Match paired HI/SI/SF/DFmode load/stores.
+(define_insn "*join2_load_store<JOIN_MODE:mode>"
+  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
+  "=r,f,m,m,r,ZC,r,k,f,k")
+	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
+   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
+   "=r,f,m,m,r,ZC,r,k,f,k")
+	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
+  "reload_completed"
+  {
+    bool load_p = (which_alternative == 0 || which_alternative == 1);
+    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
+       Hardware does not bond those loads, even when they are consecutive.
+       However, the order of the loads needs to be checked for correctness.  */
+    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
+      {
+	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
+			 operands);
+	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
+			 &operands[2]);
+      }
+    else
+      {
+	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
+			 &operands[2]);
+	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
+			 operands);
+      }
+    return "";
+  }
+  [(set_attr "move_type"
+  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
+   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
+
+;; 2 HI/SI/SF/DF loads are bonded.
+(define_peephole2
+  [(set (match_operand:JOIN_MODE 0 "register_operand")
+	(match_operand:JOIN_MODE 1 "non_volatile_mem_operand"))
+   (set (match_operand:JOIN_MODE 2 "register_operand")
+	(match_operand:JOIN_MODE 3 "non_volatile_mem_operand"))]
+  "loongarch_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, true)"
+  [(parallel [(set (match_dup 0)
+		   (match_dup 1))
+	      (set (match_dup 2)
+		   (match_dup 3))])]
+  "")
+
+;; 2 HI/SI/SF/DF stores are bonded.
+(define_peephole2
+  [(set (match_operand:JOIN_MODE 0 "memory_operand")
+	(match_operand:JOIN_MODE 1 "register_operand"))
+   (set (match_operand:JOIN_MODE 2 "memory_operand")
+	(match_operand:JOIN_MODE 3 "register_operand"))]
+  "loongarch_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, false)"
+  [(parallel [(set (match_dup 0)
+		   (match_dup 1))
+	      (set (match_dup 2)
+		   (match_dup 3))])]
+  "")
+
+;; Match paired HImode loads.
+(define_insn "*join2_loadhi"
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+	(any_extend:SI (match_operand:HI 1 "non_volatile_mem_operand" "m,k")))
+   (set (match_operand:SI 2 "register_operand" "=r,r")
+	(any_extend:SI (match_operand:HI 3 "non_volatile_mem_operand" "m,k")))]
+  "reload_completed"
+  {
+    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
+       Hardware does not bond those loads, even when they are consecutive.
+       However, the order of the loads needs to be checked for correctness.  */
+    if (!reg_overlap_mentioned_p (operands[0], operands[1]))
+      {
+	output_asm_insn ("ld.h<u>\t%0,%1", operands);
+	output_asm_insn ("ld.h<u>\t%2,%3", operands);
+      }
+    else
+      {
+	output_asm_insn ("ld.h<u>\t%2,%3", operands);
+	output_asm_insn ("ld.h<u>\t%0,%1", operands);
+      }
+
+    return "";
+  }
+  [(set_attr "move_type" "load,load")
+   (set_attr "insn_count" "2,2")])
+
+
+;; 2 HI loads are bonded.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(any_extend:SI (match_operand:HI 1 "non_volatile_mem_operand")))
+   (set (match_operand:SI 2 "register_operand")
+	(any_extend:SI (match_operand:HI 3 "non_volatile_mem_operand")))]
+  "loongarch_load_store_bonding_p (operands, HImode, true)"
+  [(parallel [(set (match_dup 0)
+		   (any_extend:SI (match_dup 1)))
+	      (set (match_dup 2)
+		   (any_extend:SI (match_dup 3)))])]
+  "")
+
+
+
+(define_mode_iterator QHSD [QI HI SI DI])
+
+(define_insn "crc_w_<size>_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:QHSD 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_CRC))]
+  ""
+  "crc.w.<size>.w\t%0,%1,%2"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "crcc_w_<size>_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:QHSD 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_CRCC))]
+  ""
+  "crcc.w.<size>.w\t%0,%1,%2"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+;; Synchronization instructions.
+
+(include "sync.md")
+
+(include "generic.md")
+(include "la464.md")
+
+(define_c_enum "unspec" [
+  UNSPEC_ADDRESS_FIRST
+])
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
new file mode 100644
index 00000000000..ec5f74f4454
--- /dev/null
+++ b/gcc/config/loongarch/predicates.md
@@ -0,0 +1,527 @@ 
+;; Predicate definitions for LoongArch target.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_predicate "const_uns_arith_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND_UNSIGNED (INTVAL (op))")))
+
+(define_predicate "uns_arith_operand"
+  (ior (match_operand 0 "const_uns_arith_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_lu32i_operand"
+  (and (match_code "const_int")
+       (match_test "LU32I_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_lu52i_operand"
+  (and (match_code "const_int")
+       (match_test "LU52I_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_arith_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_imm16_operand"
+  (and (match_code "const_int")
+       (match_test "IMM16_OPERAND (INTVAL (op))")))
+
+(define_predicate "arith_operand"
+  (ior (match_operand 0 "const_arith_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_immalsl_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 1, 4)")))
+
+(define_predicate "const_uimm3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_uimm4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
+(define_predicate "const_uimm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
+
+(define_predicate "const_uimm6_operand"
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_uimm7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 127)")))
+
+(define_predicate "const_uimm8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+(define_predicate "const_uimm14_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 16383)")))
+
+(define_predicate "const_uimm15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 32767)")))
+
+(define_predicate "const_imm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -16, 15)")))
+
+(define_predicate "const_imm10_operand"
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_imm12_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op))")))
+
+(define_predicate "reg_imm10_operand"
+  (ior (match_operand 0 "const_imm10_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "aq8b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "aq8h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 1)")))
+
+(define_predicate "aq8w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "aq8d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "aq10b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 0)")))
+
+(define_predicate "aq10h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 1)")))
+
+(define_predicate "aq10w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq10d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 3)")))
+
+(define_predicate "aq12b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 12, 0)")))
+
+(define_predicate "aq12h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 11, 1)")))
+
+(define_predicate "aq12w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq12d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 9, 3)")))
+
+(define_predicate "sle_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op) + 1)")))
+
+(define_predicate "sleu_operand"
+  (and (match_operand 0 "sle_operand")
+       (match_test "INTVAL (op) + 1 != 0")))
+
+(define_predicate "const_0_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))
+
+(define_predicate "const_m1_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_m1_operand"
+  (ior (match_operand 0 "const_m1_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "reg_or_0_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_1_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_1_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (match_operand 0 "register_operand")))
+
+;; These are used in vec_merge, hence accept bitmask as const_int.
+(define_predicate "const_exp_2_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 1)")))
+
+(define_predicate "const_exp_4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 3)")))
+
+(define_predicate "const_exp_8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 7)")))
+
+(define_predicate "const_exp_16_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 15)")))
+
+(define_predicate "const_exp_32_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 31)")))
+
+;; This is used for indexing into vectors, and hence only accepts const_int.
+(define_predicate "const_0_or_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
+
+(define_predicate "const_2_or_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
+
+(define_predicate "const_0_to_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 3)")))
+
+(define_predicate "const_0_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_4_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 4, 7)")))
+
+(define_predicate "const_8_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 8, 15)")))
+
+(define_predicate "const_16_to_31_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 16, 31)")))
+
+(define_predicate "qi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xff")))
+
+(define_predicate "hi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffff")))
+
+(define_predicate "lu52i_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xfffffffffffff")))
+
+(define_predicate "si_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffffffff")))
+
+(define_predicate "low_bitmask_operand"
+  (and (match_code "const_int")
+       (match_test "low_bitmask_len (mode, INTVAL (op)) > 12")))
+
+(define_predicate "d_operand"
+  (and (match_code "reg")
+       (match_test "GP_REG_P (REGNO (op))")))
+
+(define_predicate "db4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 4, 0)")))
+
+(define_predicate "db7_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 7, 0)")))
+
+(define_predicate "db8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 8, 0)")))
+
+(define_predicate "ib3_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) - 1, 3, 0)")))
+
+(define_predicate "sb4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "sb5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 5, 0)")))
+
+(define_predicate "sb8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "sd8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "ub4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "ub8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "uh4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 1)")))
+
+(define_predicate "uw4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 2)")))
+
+(define_predicate "uw5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 5, 2)")))
+
+(define_predicate "uw6_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 6, 2)")))
+
+(define_predicate "uw8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "addiur2_operand"
+  (and (match_code "const_int")
+	(ior (match_test "INTVAL (op) == -1")
+	     (match_test "INTVAL (op) == 1")
+	     (match_test "INTVAL (op) == 4")
+	     (match_test "INTVAL (op) == 8")
+	     (match_test "INTVAL (op) == 12")
+	     (match_test "INTVAL (op) == 16")
+	     (match_test "INTVAL (op) == 20")
+	     (match_test "INTVAL (op) == 24"))))
+
+(define_predicate "addiusp_operand"
+  (and (match_code "const_int")
+       (ior (match_test "(IN_RANGE (INTVAL (op), 2, 257))")
+	    (match_test "(IN_RANGE (INTVAL (op), -258, -3))"))))
+
+(define_predicate "andi16_operand"
+  (and (match_code "const_int")
+	(ior (match_test "IN_RANGE (INTVAL (op), 1, 4)")
+	     (match_test "IN_RANGE (INTVAL (op), 7, 8)")
+	     (match_test "IN_RANGE (INTVAL (op), 15, 16)")
+	     (match_test "IN_RANGE (INTVAL (op), 31, 32)")
+	     (match_test "IN_RANGE (INTVAL (op), 63, 64)")
+	     (match_test "INTVAL (op) == 255")
+	     (match_test "INTVAL (op) == 32768")
+	     (match_test "INTVAL (op) == 65535"))))
+
+(define_predicate "movep_src_register"
+  (and (match_code "reg")
+       (ior (match_test "IN_RANGE (REGNO (op), 2, 3)")
+	    (match_test "IN_RANGE (REGNO (op), 16, 20)"))))
+
+(define_predicate "movep_src_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "movep_src_register")))
+
+(define_predicate "fcc_reload_operand"
+  (and (match_code "reg,subreg")
+       (match_test "FCC_REG_P (true_regnum (op))")))
+
+(define_predicate "muldiv_target_operand"
+  (match_operand 0 "register_operand"))
+
+(define_predicate "const_call_insn_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type symbol_type;
+
+  if (!loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_CALL, &symbol_type))
+    return false;
+
+  switch (symbol_type)
+    {
+    case SYMBOL_GOT_DISP:
+      /* Without explicit relocs, there is no special syntax for
+	 loading the address of a call destination into a register.
+	 Using "la.global JIRL_REGS,foo; jirl JIRL_REGS" would prevent the lazy
+	 binding of "foo", so keep the address of global symbols with the jirl
+	 macro.  */
+      return true;
+
+    default:
+      return false;
+    }
+})
+
+(define_predicate "call_insn_operand"
+  (ior (match_operand 0 "const_call_insn_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "is_const_call_local_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (ior (match_test "loongarch_global_symbol_p (op) == 0")
+	    (match_test "loongarch_symbol_binds_local_p (op) != 0"))
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_weak_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (not (match_operand 0 "is_const_call_local_symbol"))
+       (match_test "loongarch_weak_symbol_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_plt_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (match_test "flag_plt != 0")
+       (match_test "loongarch_global_symbol_noweak_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_global_noplt_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (match_test "flag_plt == 0")
+       (match_test "loongarch_global_symbol_noweak_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+;; A legitimate CONST_INT operand that takes more than one instruction
+;; to load.
+(define_predicate "splittable_const_int_operand"
+  (match_code "const_int")
+{
+  /* Don't handle multi-word moves this way; we don't want to introduce
+     the individual word-mode moves until after reload.  */
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+    return false;
+
+  /* Otherwise check whether the constant can be loaded in a single
+     instruction.  */
+  return !LU12I_INT (op) && !IMM12_INT (op) && !IMM12_INT_UNSIGNED (op)
+	 && !LU52I_INT (op);
+})
+
+(define_predicate "move_operand"
+  (match_operand 0 "general_operand")
+{
+  enum loongarch_symbol_type symbol_type;
+
+  /* The thinking here is as follows:
+
+     (1) The move expanders should split complex load sequences into
+	 individual instructions.  Those individual instructions can
+	 then be optimized by all rtl passes.
+
+     (2) The target of pre-reload load sequences should not be used
+	 to store temporary results.  If the target register is only
+	 assigned one value, reload can rematerialize that value
+	 on demand, rather than spill it to the stack.
+
+     (3) If we allowed pre-reload passes like combine and cse to recreate
+	 complex load sequences, we would want to be able to split the
+	 sequences before reload as well, so that the pre-reload scheduler
+	 can see the individual instructions.  This falls foul of (2);
+	 the splitter would be forced to reuse the target register for
+	 intermediate results.
+
+     (4) We want to define complex load splitters for combine.  These
+	 splitters can request a temporary scratch register, which avoids
+	 the problem in (2).  They allow things like:
+
+	      (set (reg T1) (high SYM))
+	      (set (reg T2) (lo_sum (reg T1) SYM))
+	      (set (reg X) (plus (reg T2) (const_int OFFSET)))
+
+	 to be combined into:
+
+	      (set (reg T3) (high SYM+OFFSET))
+	      (set (reg X) (lo_sum (reg T3) SYM+OFFSET))
+
+	 if T2 is only used this once.  */
+  switch (GET_CODE (op))
+    {
+    case CONST_INT:
+      return !splittable_const_int_operand (op, mode);
+
+    case CONST:
+    case SYMBOL_REF:
+    case LABEL_REF:
+      return (loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA,
+					     &symbol_type));
+    default:
+      return true;
+    }
+})
+
+(define_predicate "consttable_operand"
+  (match_test "CONSTANT_P (op)"))
+
+(define_predicate "symbolic_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type type;
+  return loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type);
+})
+
+(define_predicate "got_disp_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type type;
+  return (loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type)
+	  && type == SYMBOL_GOT_DISP);
+})
+
+(define_predicate "symbol_ref_operand"
+  (match_code "symbol_ref"))
+
+(define_predicate "equality_operator"
+  (match_code "eq,ne"))
+
+(define_predicate "extend_operator"
+  (match_code "zero_extend,sign_extend"))
+
+(define_predicate "trap_comparison_operator"
+  (match_code "eq,ne,lt,ltu,ge,geu"))
+
+(define_predicate "order_operator"
+  (match_code "lt,ltu,le,leu,ge,geu,gt,gtu"))
+
+;; For NE, cstore uses sltu instructions in which the first operand is $0.
+
+(define_predicate "loongarch_cstore_operator"
+  (match_code "ne,eq,gt,gtu,ge,geu,lt,ltu,le,leu"))
+
+(define_predicate "small_data_pattern"
+  (and (match_code "set,parallel,unspec,unspec_volatile,prefetch")
+       (match_test "loongarch_small_data_pattern_p (op)")))
+
+(define_predicate "mem_noofs_operand"
+  (and (match_code "mem")
+       (match_code "reg" "0")))
+
+;; Return 1 if the operand is in non-volatile memory.
+(define_predicate "non_volatile_mem_operand"
+  (and (match_operand 0 "memory_operand")
+       (not (match_test "MEM_VOLATILE_P (op)"))))
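
Note on the immediate predicates above: most of them pass a (bits, shift) pair to loongarch_signed_immediate_p or loongarch_unsigned_immediate_p. Those helpers live in loongarch.c and are not part of this file, so the following is only a sketch of what the two arguments are assumed to mean, modeled on the equivalent MIPS helpers. The function names here are hypothetical stand-ins, not the port's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for loongarch_unsigned_immediate_p: X must be a
   multiple of 1 << SHIFT and, once scaled, fit in BITS unsigned bits.  */
static bool
unsigned_immediate_p (uint64_t x, int bits, int shift)
{
  return (x & ((UINT64_C (1) << shift) - 1)) == 0
	 && x < (UINT64_C (1) << (shift + bits));
}

/* Hypothetical stand-in for loongarch_signed_immediate_p: bias X so that
   the signed range maps onto the unsigned one, then reuse the check.  */
static bool
signed_immediate_p (int64_t x, int bits, int shift)
{
  uint64_t biased = (uint64_t) x + (UINT64_C (1) << (bits + shift - 1));
  return unsigned_immediate_p (biased, bits, shift);
}
```

Under that reading, aq12h_operand (bits 11, shift 1) would accept even values from -2048 to 2046, and db4_operand would accept -1 to 14 because it tests INTVAL (op) + 1 against a 4-bit unsigned range.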
diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
new file mode 100644
index 00000000000..0c4f1983e88
--- /dev/null
+++ b/gcc/config/loongarch/sync.md
@@ -0,0 +1,574 @@ 
+;; Machine description for LoongArch atomic operations.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on the MIPS and RISC-V targets for the GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  UNSPEC_COMPARE_AND_SWAP
+  UNSPEC_COMPARE_AND_SWAP_ADD
+  UNSPEC_COMPARE_AND_SWAP_SUB
+  UNSPEC_COMPARE_AND_SWAP_AND
+  UNSPEC_COMPARE_AND_SWAP_XOR
+  UNSPEC_COMPARE_AND_SWAP_OR
+  UNSPEC_COMPARE_AND_SWAP_NAND
+  UNSPEC_SYNC_OLD_OP
+  UNSPEC_SYNC_EXCHANGE
+  UNSPEC_ATOMIC_STORE
+  UNSPEC_MEMORY_BARRIER
+])
+
+(define_code_iterator any_atomic [plus ior xor and])
+(define_code_attr atomic_optab
+  [(plus "add") (ior "or") (xor "xor") (and "and")])
+
+;; This attribute gives the format suffix for atomic memory operations.
+(define_mode_attr amo [(SI "w") (DI "d")])
+
+;; <amop> expands to the name of the atomic operation that implements a
+;; particular code.
+(define_code_attr amop [(ior "or") (xor "xor") (and "and") (plus "add")])
+
+;; Memory barriers.
+
+(define_expand "mem_thread_fence"
+  [(match_operand:SI 0 "const_int_operand" "")] ;; model
+  ""
+{
+  if (INTVAL (operands[0]) != MEMMODEL_RELAXED)
+    {
+      rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+      MEM_VOLATILE_P (mem) = 1;
+      emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
+    }
+  DONE;
+})
+
+;; Until the LoongArch memory model (hence its mapping from C++) is finalized,
+;; conservatively emit a full barrier (dbar 0).
+(define_insn "mem_thread_fence_1"
+  [(set (match_operand:BLK 0 "" "")
+	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+   (match_operand:SI 1 "const_int_operand" "")] ;; model
+  ""
+  "dbar\t0")
+
+;; Atomic memory operations.
+
+;; Implement atomic stores with amswap.  Fall back to fences for atomic loads.
+(define_insn "atomic_store<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+ZB")
+    (unspec_volatile:GPR
+      [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+       (match_operand:SI 2 "const_int_operand")]      ;; model
+      UNSPEC_ATOMIC_STORE))]
+  ""
+  "amswap%A2.<amo>\t$zero,%z1,%0"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_<atomic_optab><mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+ZB")
+	(unspec_volatile:GPR
+	  [(any_atomic:GPR (match_dup 0)
+			   (match_operand:GPR 1 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 2 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+  "am<amop>%A2.<amo>\t$zero,%z1,%0"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_fetch_<atomic_optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+	  [(any_atomic:GPR (match_dup 1)
+		     (match_operand:GPR 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+  "am<amop>%A3.<amo>\t%0,%z2,%1"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_exchange<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(unspec_volatile:GPR
+	  [(match_operand:GPR 1 "memory_operand" "+ZB")
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	  UNSPEC_SYNC_EXCHANGE))
+   (set (match_dup 1)
+	(match_operand:GPR 2 "register_operand" "r"))]
+  ""
+  "amswap%A3.<amo>\t%0,%z2,%1"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_cas_value_strong<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")
+			      (match_operand:SI 4 "const_int_operand")  ;; mod_s
+			      (match_operand:SI 5 "const_int_operand")] ;; mod_f
+	 UNSPEC_COMPARE_AND_SWAP))
+   (clobber (match_scratch:GPR 6 "=&r"))]
+  ""
+{
+  return "%G5\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "bne\\t%0,%z2,2f\\n\\t"
+	 "or%i3\\t%6,$zero,%3\\n\\t"
+	 "sc.<amo>\\t%6,%1\\n\\t"
+	 "beq\\t$zero,%6,1b\\n\\t"
+	 "b\\t3f\\n\\t"
+	 "2:\\n\\t"
+	 "dbar\\t0x700\\n\\t"
+	 "3:\\n\\t";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_expand "atomic_compare_and_swap<mode>"
+  [(match_operand:SI 0 "register_operand" "")   ;; bool output
+   (match_operand:GPR 1 "register_operand" "")  ;; val output
+   (match_operand:GPR 2 "memory_operand" "")    ;; memory
+   (match_operand:GPR 3 "reg_or_0_operand" "")  ;; expected value
+   (match_operand:GPR 4 "reg_or_0_operand" "")  ;; desired value
+   (match_operand:SI 5 "const_int_operand" "")  ;; is_weak
+   (match_operand:SI 6 "const_int_operand" "")  ;; mod_s
+   (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
+  ""
+{
+  emit_insn (gen_atomic_cas_value_strong<mode> (operands[1], operands[2],
+						operands[3], operands[4],
+						operands[6], operands[7]));
+
+  rtx compare = operands[1];
+  if (operands[3] != const0_rtx)
+    {
+      rtx difference = gen_rtx_MINUS (<MODE>mode, operands[1], operands[3]);
+      compare = gen_reg_rtx (<MODE>mode);
+      emit_insn (gen_rtx_SET (compare, difference));
+    }
+
+  if (word_mode != <MODE>mode)
+    {
+      rtx reg = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
+      compare = reg;
+    }
+
+  emit_insn (gen_rtx_SET (operands[0],
+			  gen_rtx_EQ (SImode, compare, const0_rtx)));
+  DONE;
+})
+
+(define_expand "atomic_test_and_set"
+  [(match_operand:QI 0 "register_operand" "")     ;; bool output
+   (match_operand:QI 1 "memory_operand" "+ZB")    ;; memory
+   (match_operand:SI 2 "const_int_operand" "")]   ;; model
+  ""
+{
+  /* We have no QImode atomics, so use the address LSBs to form a mask,
+     then use an aligned SImode atomic.  */
+  rtx result = operands[0];
+  rtx mem = operands[1];
+  rtx model = operands[2];
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+  rtx tmp_reg = gen_reg_rtx (Pmode);
+  rtx zero_reg = gen_rtx_REG (Pmode, 0);
+
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (tmp_reg, gen_rtx_PLUS (Pmode, zero_reg, GEN_INT (-4)));
+  emit_move_insn (aligned_addr, gen_rtx_AND (Pmode, addr, tmp_reg));
+
+  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
+  set_mem_alias_set (aligned_mem, 0);
+
+  rtx offset = gen_reg_rtx (SImode);
+  emit_move_insn (offset, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
+				       GEN_INT (3)));
+
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_move_insn (tmp, GEN_INT (1));
+
+  rtx shmt = gen_reg_rtx (SImode);
+  emit_move_insn (shmt, gen_rtx_ASHIFT (SImode, offset, GEN_INT (3)));
+
+  rtx word = gen_reg_rtx (SImode);
+  emit_move_insn (word, gen_rtx_ASHIFT (SImode, tmp, shmt));
+
+  tmp = gen_reg_rtx (SImode);
+  emit_insn (gen_atomic_fetch_orsi (tmp, aligned_mem, word, model));
+
+  emit_move_insn (gen_lowpart (SImode, result),
+		  gen_rtx_LSHIFTRT (SImode, tmp, shmt));
+  DONE;
+})
+
+(define_insn "atomic_cas_value_cmp_and_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")
+			      (match_operand:SI 6 "const_int_operand")] ;; model
+	 UNSPEC_COMPARE_AND_SWAP))
+   (clobber (match_scratch:GPR 7 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%2\\n\\t"
+	 "bne\\t%7,%z4,2f\\n\\t"
+	 "and\\t%7,%0,%z3\\n\\t"
+	 "or%i5\\t%7,%7,%5\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b\\n\\t"
+	 "b\\t3f\\n\\t"
+	 "2:\\n\\t"
+	 "dbar\\t0x700\\n\\t"
+	 "3:\\n\\t";
+}
+  [(set (attr "length") (const_int 40))])
+
+(define_expand "atomic_compare_and_swap<mode>"
+  [(match_operand:SI 0 "register_operand" "")   ;; bool output
+   (match_operand:SHORT 1 "register_operand" "")  ;; val output
+   (match_operand:SHORT 2 "memory_operand" "")    ;; memory
+   (match_operand:SHORT 3 "reg_or_0_operand" "")  ;; expected value
+   (match_operand:SHORT 4 "reg_or_0_operand" "")  ;; desired value
+   (match_operand:SI 5 "const_int_operand" "")  ;; is_weak
+   (match_operand:SI 6 "const_int_operand" "")  ;; mod_s
+   (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[1], operands[2],
+				operands[3], operands[4], operands[7]);
+
+  rtx compare = operands[1];
+  if (operands[3] != const0_rtx)
+    {
+      machine_mode mode = GET_MODE (operands[3]);
+      rtx op1 = convert_modes (SImode, mode, operands[1], true);
+      rtx op3 = convert_modes (SImode, mode, operands[3], true);
+      rtx difference = gen_rtx_MINUS (SImode, op1, op3);
+      compare = gen_reg_rtx (SImode);
+      emit_insn (gen_rtx_SET (compare, difference));
+    }
+
+  if (word_mode != <MODE>mode)
+    {
+      rtx reg = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
+      compare = reg;
+    }
+
+  emit_insn (gen_rtx_SET (operands[0],
+			  gen_rtx_EQ (SImode, compare, const0_rtx)));
+  DONE;
+})
+
+(define_insn "atomic_cas_value_add_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_ADD))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "add.w\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_sub_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_SUB))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "sub.w\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_and_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_AND))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "and\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_xor_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_XOR))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "xor\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_or_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_OR))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "or\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_nand_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_NAND))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "and\\t%8,%0,%z5\\n\\t"
+	 "xor\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_expand "atomic_exchange<mode>"
+  [(set (match_operand:SHORT 0 "register_operand")
+	(unspec_volatile:SHORT
+	  [(match_operand:SHORT 1 "memory_operand")
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	  UNSPEC_SYNC_EXCHANGE))
+   (set (match_dup 1)
+	(match_operand:SHORT 2 "register_operand"))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_add<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(plus:SHORT (match_dup 1)
+		       (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_add_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_sub<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(minus:SHORT (match_dup 1)
+			(match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_sub_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_and<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(and:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_xor<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(xor:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_xor_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_or<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(ior:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_or_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_nand<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(not:SHORT (and:SHORT (match_dup 1)
+				 (match_operand:SHORT 2 "reg_or_0_operand" "rJ")))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_nand_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
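
Note on atomic_test_and_set above: the expander has no QImode atomic to use, so it masks the address down to a 4-byte boundary, builds a one-bit mask at the byte's position, and issues an SImode fetch-or. The arithmetic can be sketched in C11 as below. This is an illustration only, assuming little-endian byte order; the helper names are hypothetical and the real expansion is the RTL sequence in the pattern.

```c
#include <stdatomic.h>
#include <stdint.h>

/* addr & -4: the aligned word containing the byte (hypothetical helper).  */
static uintptr_t
aligned_word_addr (uintptr_t addr)
{
  return addr & ~(uintptr_t) 3;
}

/* Set the flag bit of byte BYTE_OFFSET (addr & 3) inside *ALIGNED_WORD
   with a word-sized fetch-or, and return the previous value of that byte.  */
static unsigned char
byte_test_and_set (_Atomic uint32_t *aligned_word, unsigned byte_offset)
{
  unsigned shmt = (byte_offset & 3) * 8;	/* offset << 3 */
  uint32_t word = UINT32_C (1) << shmt;		/* 1 << shmt */
  uint32_t old = atomic_fetch_or (aligned_word, word);
  return (unsigned char) (old >> shmt);		/* low byte of old >> shmt */
}
```

The word-sized fetch-or leaves the other three bytes unchanged, which is why the pattern can reuse atomic_fetch_orsi instead of a masked LL/SC loop.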