From patchwork Fri Jul 28 15:06:17 2017
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 794947
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Tamar Christina
To: James Greenhalgh
CC: Richard Sandiford, GCC Patches, nd, Marcus Shawcroft, Richard Earnshaw
Subject: Re: [PATCH][GCC][AArch64] optimize float immediate moves (1/4) - infrastructure.
Date: Fri, 28 Jul 2017 15:06:17 +0000
References: <8760g6bwig.fsf@linaro.org> <20170613163934.GA1372@arm.com> <20170719154731.GA35511@arm.com> <20170727154237.GA31556@arm.com>
In-Reply-To: <20170727154237.GA31556@arm.com>
Hi James,

I have made a tiny change to the patch to allow 0 to always pass through,
since in all cases we can always do a move of 0.  I will assume your OK
still stands under the obvious rule and will commit with this change.

Thanks,
Tamar

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e397ff4afa73cfbc7e192fd5686b1beff9bbbadf..beff28e2272b7c771c5ae5f3e17f10fc5f9711d0 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -319,6 +319,7 @@ unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
 bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
 int aarch64_branch_cost (bool, bool);
 enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
+bool aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode);
 bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
 bool aarch64_constant_address_p (rtx);
 bool aarch64_emit_approx_div (rtx, rtx, rtx);
@@ -326,6 +327,7 @@ bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 void aarch64_expand_call (rtx, rtx, bool);
 bool aarch64_expand_movmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
+bool aarch64_float_const_rtx_p (rtx);
 bool aarch64_function_arg_regno_p (unsigned);
 bool aarch64_fusion_enabled_p (enum aarch64_fusion_pairs);
 bool aarch64_gen_movmemqi (rtx *);
@@ -351,9 +353,9 @@ bool aarch64_pad_arg_upward (machine_mode, const_tree);
 bool aarch64_pad_reg_upward (machine_mode, const_tree, bool);
 bool aarch64_regno_ok_for_base_p (int, bool);
 bool aarch64_regno_ok_for_index_p (int, bool);
+bool aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT *fail);
 bool aarch64_simd_check_vect_par_cnst_half (rtx op, machine_mode mode,
					     bool high);
-bool aarch64_simd_imm_scalar_p (rtx x, machine_mode mode);
 bool aarch64_simd_imm_zero_p (rtx, machine_mode);
 bool
   aarch64_simd_scalar_immediate_valid_for_move (rtx, machine_mode);
 bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d753666ef67fc524260c41f36743df3649a0a98a..79383ad80090b19a3e604fa7c91250ffc629f682 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -147,6 +147,8 @@ static bool aarch64_builtin_support_vector_misalignment (machine_mode mode,
							 const_tree type,
							 int misalignment,
							 bool is_packed);
+static machine_mode
+aarch64_simd_container_mode (machine_mode mode, unsigned width);
 
 /* Major revision number of the ARM Architecture implemented by the target.  */
 unsigned aarch64_architecture_version;
@@ -4723,6 +4725,69 @@ aarch64_legitimize_address_displacement (rtx *disp, rtx *off, machine_mode mode)
   return true;
 }
 
+/* Return the binary representation of floating point constant VALUE in INTVAL.
+   If the value cannot be converted, return false without setting INTVAL.
+   The conversion is done in the given MODE.  */
+bool
+aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT *intval)
+{
+
+  /* We make a general exception for 0.  */
+  if (aarch64_float_const_zero_rtx_p (value))
+    {
+      *intval = 0;
+      return true;
+    }
+
+  machine_mode mode = GET_MODE (value);
+  if (GET_CODE (value) != CONST_DOUBLE
+      || !SCALAR_FLOAT_MODE_P (mode)
+      || GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT
+      /* Only support up to DF mode.  */
+      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (DFmode))
+    return false;
+
+  unsigned HOST_WIDE_INT ival = 0;
+
+  long res[2];
+  real_to_target (res,
+		  CONST_DOUBLE_REAL_VALUE (value),
+		  REAL_MODE_FORMAT (mode));
+
+  ival = zext_hwi (res[0], 32);
+  if (GET_MODE_BITSIZE (mode) == GET_MODE_BITSIZE (DFmode))
+    ival |= (zext_hwi (res[1], 32) << 32);
+
+  *intval = ival;
+  return true;
+}
+
+/* Return TRUE if rtx X is an immediate constant that can be moved using a
+   single MOV(+MOVK) followed by an FMOV.
   */
+bool
+aarch64_float_const_rtx_p (rtx x)
+{
+  machine_mode mode = GET_MODE (x);
+  if (mode == VOIDmode)
+    return false;
+
+  /* Determine whether it's cheaper to write float constants as
+     mov/movk pairs over ldr/adrp pairs.  */
+  unsigned HOST_WIDE_INT ival;
+
+  if (GET_CODE (x) == CONST_DOUBLE
+      && SCALAR_FLOAT_MODE_P (mode)
+      && aarch64_reinterpret_float_as_int (x, &ival))
+    {
+      machine_mode imode = mode == HFmode ? SImode : int_mode_for_mode (mode);
+      int num_instr = aarch64_internal_mov_immediate
+			(NULL_RTX, gen_int_mode (ival, imode), false, imode);
+      return num_instr < 3;
+    }
+
+  return false;
+}
+
 /* Return TRUE if rtx X is immediate constant 0.0 */
 bool
 aarch64_float_const_zero_rtx_p (rtx x)
@@ -4735,6 +4800,49 @@
 
   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
 }
 
+/* Return TRUE if rtx X is immediate constant that fits in a single
+   MOVI immediate operation.  */
+bool
+aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode)
+{
+  if (!TARGET_SIMD)
+    return false;
+
+  /* We make a general exception for 0.  */
+  if (aarch64_float_const_zero_rtx_p (x))
+    return true;
+
+  machine_mode vmode, imode;
+  unsigned HOST_WIDE_INT ival;
+
+  if (GET_CODE (x) == CONST_DOUBLE
+      && SCALAR_FLOAT_MODE_P (mode))
+    {
+      if (!aarch64_reinterpret_float_as_int (x, &ival))
+	return false;
+
+      imode = int_mode_for_mode (mode);
+    }
+  else if (GET_CODE (x) == CONST_INT
+	   && SCALAR_INT_MODE_P (mode))
+    {
+      imode = mode;
+      ival = INTVAL (x);
+    }
+  else
+    return false;
+
+  /* use a 64 bit mode for everything except for DI/DF mode, where we use
+     a 128 bit vector mode.  */
+  int width = GET_MODE_BITSIZE (mode) == 64 ? 128 : 64;
+
+  vmode = aarch64_simd_container_mode (imode, width);
+  rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, ival);
+
+  return aarch64_simd_valid_immediate (v_op, vmode, false, NULL);
+}
+
+
 /* Return the fixed registers used for condition codes.
   */
 static bool
@@ -5929,12 +6037,6 @@ aarch64_preferred_reload_class (rtx x, reg_class_t regclass)
       return NO_REGS;
     }
 
-  /* If it's an integer immediate that MOVI can't handle, then
-     FP_REGS is not an option, so we return NO_REGS instead.  */
-  if (CONST_INT_P (x) && reg_class_subset_p (regclass, FP_REGS)
-      && !aarch64_simd_imm_scalar_p (x, GET_MODE (x)))
-    return NO_REGS;
-
   /* Register eliminiation can result in a request for
      SP+constant->FP_REGS.  We cannot support such operations which
     use SP as source and an FP_REG as destination, so reject out
@@ -6884,6 +6986,25 @@ aarch64_rtx_costs (rtx x, machine_mode mode, int outer ATTRIBUTE_UNUSED,
       return true;
 
     case CONST_DOUBLE:
+
+      /* First determine number of instructions to do the move
+	 as an integer constant.  */
+      if (!aarch64_float_const_representable_p (x)
+	  && !aarch64_can_const_movi_rtx_p (x, mode)
+	  && aarch64_float_const_rtx_p (x))
+	{
+	  unsigned HOST_WIDE_INT ival;
+	  bool succeed = aarch64_reinterpret_float_as_int (x, &ival);
+	  gcc_assert (succeed);
+
+	  machine_mode imode = mode == HFmode ? SImode
+					      : int_mode_for_mode (mode);
+	  int ncost = aarch64_internal_mov_immediate
+			(NULL_RTX, gen_int_mode (ival, imode), false, imode);
+	  *cost += COSTS_N_INSNS (ncost);
+	  return true;
+	}
+
       if (speed)
	{
	  /* mov[df,sf]_aarch64.  */
@@ -10228,18 +10349,16 @@ aarch64_legitimate_pic_operand_p (rtx x)
 /* Return true if X holds either a quarter-precision or
     floating-point +0.0 constant.  */
 static bool
-aarch64_valid_floating_const (machine_mode mode, rtx x)
+aarch64_valid_floating_const (rtx x)
 {
   if (!CONST_DOUBLE_P (x))
     return false;
 
-  if (aarch64_float_const_zero_rtx_p (x))
+  /* This call determines which constants can be used in mov
+     as integer moves instead of constant loads.  */
+  if (aarch64_float_const_rtx_p (x))
     return true;
 
-  /* We only handle moving 0.0 to a TFmode register.
     */
-  if (!(mode == SFmode || mode == DFmode))
-    return false;
-
   return aarch64_float_const_representable_p (x);
 }
 
@@ -10251,11 +10370,15 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x)
   if (TARGET_SIMD && aarch64_vect_struct_mode_p (mode))
     return false;
 
-  /* This could probably go away because
-     we now decompose CONST_INTs according to expand_mov_immediate.  */
+  /* For these cases we never want to use a literal load.
+     As such we have to prevent the compiler from forcing these
+     to memory.  */
   if ((GET_CODE (x) == CONST_VECTOR
       && aarch64_simd_valid_immediate (x, mode, false, NULL))
-      || CONST_INT_P (x) || aarch64_valid_floating_const (mode, x))
+      || CONST_INT_P (x)
+      || aarch64_valid_floating_const (x)
+      || aarch64_can_const_movi_rtx_p (x, mode)
+      || aarch64_float_const_rtx_p (x))
     return !targetm.cannot_force_const_mem (mode, x);
 
   if (GET_CODE (x) == HIGH
@@ -11538,23 +11661,6 @@ aarch64_mask_from_zextract_ops (rtx width, rtx pos)
 }
 
 bool
-aarch64_simd_imm_scalar_p (rtx x, machine_mode mode ATTRIBUTE_UNUSED)
-{
-  HOST_WIDE_INT imm = INTVAL (x);
-  int i;
-
-  for (i = 0; i < 8; i++)
-    {
-      unsigned int byte = imm & 0xff;
-      if (byte != 0xff && byte != 0)
-	return false;
-      imm >>= 8;
-    }
-
-  return true;
-}
-
-bool
 aarch64_mov_operand_p (rtx x, machine_mode mode)
 {
   if (GET_CODE (x) == HIGH
@@ -12945,15 +13051,28 @@ aarch64_output_simd_mov_immediate (rtx const_vector,
 }
 
 char*
-aarch64_output_scalar_simd_mov_immediate (rtx immediate,
-					  machine_mode mode)
+aarch64_output_scalar_simd_mov_immediate (rtx immediate, machine_mode mode)
 {
+
+  /* If a floating point number was passed and we desire to use it in an
+     integer mode do the conversion to integer.
     */
+  if (CONST_DOUBLE_P (immediate) && GET_MODE_CLASS (mode) == MODE_INT)
+    {
+      unsigned HOST_WIDE_INT ival;
+      if (!aarch64_reinterpret_float_as_int (immediate, &ival))
+	gcc_unreachable ();
+      immediate = gen_int_mode (ival, mode);
+    }
+
   machine_mode vmode;
+  /* use a 64 bit mode for everything except for DI/DF mode, where we use
+     a 128 bit vector mode.  */
+  int width = GET_MODE_BITSIZE (mode) == 64 ? 128 : 64;
 
   gcc_assert (!VECTOR_MODE_P (mode));
-  vmode = aarch64_simd_container_mode (mode, 64);
+  vmode = aarch64_simd_container_mode (mode, width);
   rtx v_op = aarch64_simd_gen_const_vector_dup (vmode, INTVAL (immediate));
-  return aarch64_output_simd_mov_immediate (v_op, vmode, 64);
+  return aarch64_output_simd_mov_immediate (v_op, vmode, width);
 }
 
 /* Split operands into moves from op[1] + op[2] into op[0].  */
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f876a2b720852d7ae40b42499638fa15f872aeb0..e6edb4af5f36cbd5a3e53be0554163f9ce2ba1ca 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -920,8 +920,8 @@
 )
 
 (define_insn_and_split "*movsi_aarch64"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w,r,*w")
-	(match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,m, m,rZ,*w,Usa,Ush,rZ,w,*w"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w, r,*w,w")
+	(match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,m, m,rZ,*w,Usa,Ush,rZ,w,*w,Ds"))]
   "(register_operand (operands[0], SImode)
     || aarch64_reg_or_zero (operands[1], SImode))"
   "@
@@ -938,8 +938,9 @@
    adrp\\t%x0, %A1
    fmov\\t%s0, %w1
    fmov\\t%w0, %s1
-   fmov\\t%s0, %s1"
-  "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
+   fmov\\t%s0, %s1
+   * return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
+  "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
    && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
   [(const_int 0)]
   "{
@@
-947,8 +948,9 @@
      DONE;
    }"
  [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\
-		     adr,adr,f_mcr,f_mrc,fmov")
-   (set_attr "fp" "*,*,*,*,*,*,yes,*,yes,*,*,yes,yes,yes")]
+		     adr,adr,f_mcr,f_mrc,fmov,neon_move")
+   (set_attr "fp" "*,*,*,*,*,*,yes,*,yes,*,*,yes,yes,yes,*")
+   (set_attr "simd" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,yes")]
 )
 
 (define_insn_and_split "*movdi_aarch64"
@@ -971,7 +973,7 @@
    fmov\\t%d0, %x1
    fmov\\t%x0, %d1
    fmov\\t%d0, %d1
-   movi\\t%d0, %1"
+   * return aarch64_output_scalar_simd_mov_immediate (operands[1], DImode);"
  "(CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), DImode))
   && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
  [(const_int 0)]
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 88e840f2898d2da3e51e753578ee59bce4f462fa..9ce3d4efaf31a301dfb7c1772a6b685fb2cbd2ee 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -176,6 +176,12 @@
   (and (match_code "const_double")
	(match_test "aarch64_float_const_representable_p (op)")))
 
+(define_constraint "Uvi"
+  "A floating point constant which can be used with a\
+  MOVI immediate operation."
+  (and (match_code "const_double")
+       (match_test "aarch64_can_const_movi_rtx_p (op, GET_MODE (op))")))
+
 (define_constraint "Dn"
   "@internal
  A constraint that matches vector of immediates."
@@ -220,9 +226,17 @@
 
 (define_constraint "Dd"
   "@internal
-  A constraint that matches an immediate operand valid for AdvSIMD scalar."
+  A constraint that matches an integer immediate operand valid\
+  for AdvSIMD scalar operations in DImode."
+  (and (match_code "const_int")
+       (match_test "aarch64_can_const_movi_rtx_p (op, DImode)")))
+
+(define_constraint "Ds"
+  "@internal
+  A constraint that matches an integer immediate operand valid\
+  for AdvSIMD scalar operations in SImode."
   (and (match_code "const_int")
-       (match_test "aarch64_simd_imm_scalar_p (op, GET_MODE (op))")))
+       (match_test "aarch64_can_const_movi_rtx_p (op, SImode)")))
 
 (define_address_constraint "Dp"
   "@internal