From patchwork Fri Apr 26 13:25:21 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Greenhalgh <james.greenhalgh@arm.com>
X-Patchwork-Id: 239873
Return-Path: 
 <gcc-patches-return-340622-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified))
	by ozlabs.org (Postfix) with ESMTPS id 4C8D22C00CA
	for <incoming@patchwork.ozlabs.org>;
	Fri, 26 Apr 2013 23:25:49 +1000 (EST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:date:message-id:mime-version:content-type; q=dns;
	s=default; b=lBGdQvorhYvt+GhVA0MaRaqtpM/f5mYx9hLdxxtFvxyDyhE3Ih
	wCi4bg4uQCXNx4E9ySCC6XcU8nHbXHlsLjt/GKGnAz0tOlG0HNUkpqw1xu9+BMiN
	+0uBB8N/ej4Tdf4cHARt81xw9LElLRnrH+g5x7Rmye6OjGCzVwADqyufw=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:date:message-id:mime-version:content-type; s=
	default; bh=nBuKS8lchr9xCBjqq2dyG7cPrkU=; b=a7o67ytxgeBmgqaICJSu
	y99gXzs+3sgDgjIXxUFD4wLL2xXof3PiPp+BYkFEytlCsz/mXODSXC/br69XmdGj
	OH8vzu3/coAJNEmAo1CpkWc7X4zdAZq3eXYAeDfH115FRPIf9lH+18M5nroerVLE
	Cqx+/+Kj3dAeeJz/SZB1iHs=
Received: (qmail 30378 invoked by alias); 26 Apr 2013 13:25:29 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <mailto:gcc-patches-unsubscribe-##L=##H@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 30356 invoked by uid 89); 26 Apr 2013 13:25:29 -0000
X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=AWL, BAYES_00,
	RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.1
Received: from service87.mimecast.com (HELO service87.mimecast.com)
	(91.220.42.44) by sourceware.org
	(qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP;
	Fri, 26 Apr 2013 13:25:29 +0000
Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com
	[217.140.96.21]) by service87.mimecast.com;
	Fri, 26 Apr 2013 14:25:26 +0100
Received: from e106375-lin.cambridge.arm.com ([10.1.255.212]) by
	cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959);
	Fri, 26 Apr 2013 14:25:25 +0100
From: James Greenhalgh <james.greenhalgh@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: marcus.shawcroft@arm.com
Subject: [AArch64] Implement vector float->double widening and double->float
	narrowing.
Date: Fri, 26 Apr 2013 14:25:21 +0100
Message-Id: <1366982721-22789-1-git-send-email-james.greenhalgh@arm.com>
MIME-Version: 1.0
X-MC-Unique: 113042614252616601
X-Virus-Found: No

Hi,

gcc.dg/vect/vect-float-truncate-1.c and
gcc.dg/vect/vect-float-extend-1.c
were failing because vector float->double widening and double->float
narrowing were not wired up.
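
For context, the loops in question have roughly the following shape.
This is an illustrative sketch only, not the source of the two tests;
the function names widen and narrow are made up for the example.

```c
#include <stddef.h>

/* Widening loop: every float element is converted to double.
   A vectorizer needs the target's "unpack" patterns to handle this.  */
void
widen (double *restrict dst, const float *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) src[i];
}

/* Narrowing loop: every double element is converted to float.
   A vectorizer needs the target's "pack_trunc" pattern to handle this.  */
void
narrow (float *restrict dst, const double *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (float) src[i];
}
```

With 128-bit vectors, one V4SF input widens to two V2DF results, and
two V2DF inputs narrow to one V4SF result, which is why the patterns
come in lo/hi pairs.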

This patch fixes that by implementing the standard names:

`vec_pack_trunc_v2df'
Taking two vectors of V2DF mode and returning one vector of V4SF mode.

`vec_pack_trunc_df'
Taking two DF mode scalars and returning one vector of V2SF mode.

`vec_unpacks_hi_v4sf', `vec_unpacks_lo_v4sf'
Taking one vector of V4SF mode and splitting it to two vectors of V2DF mode.
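
As a rough element-wise model of what the pack and unpack patterns in
the diff compute, consider the scalar C below.  The function names are
hypothetical, and the lane numbering follows the RTL in the patterns
(lanes 0 and 1 are "lo", lanes 2 and 3 are "hi"); it is a sketch of
the semantics, not of how the compiler implements them.

```c
/* Model of vec_pack_trunc_v2df: narrow two 2 x double vectors into
   one 4 x float vector.  The first input fills lanes 0-1 (fcvtn),
   the second fills lanes 2-3 (fcvtn2).  */
void
model_pack_trunc_v2df (float out[4], const double a[2], const double b[2])
{
  out[0] = (float) a[0];
  out[1] = (float) a[1];
  out[2] = (float) b[0];
  out[3] = (float) b[1];
}

/* Model of vec_unpacks_lo_v4sf: widen lanes 0-1 (fcvtl).  */
void
model_unpacks_lo_v4sf (double out[2], const float in[4])
{
  out[0] = (double) in[0];
  out[1] = (double) in[1];
}

/* Model of vec_unpacks_hi_v4sf: widen lanes 2-3 (fcvtl2).  */
void
model_unpacks_hi_v4sf (double out[2], const float in[4])
{
  out[0] = (double) in[2];
  out[1] = (double) in[3];
}
```

Packing two vectors and then unpacking the lo and hi halves returns
the original values whenever they are exactly representable as floats.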

Regression tested on aarch64-none-elf with no regressions, and shown
to fix the two failures above.

Thanks,
James
---
gcc/

2013-04-26  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-simd-builtins.def (vec_unpacks_hi_): New.
	(float_truncate_hi_): Likewise.
	(float_extend_lo_): Likewise.
	(float_truncate_lo_): Likewise.
	* config/aarch64/aarch64-simd.md (vec_unpacks_lo_v4sf): New.
	(aarch64_float_extend_lo_v2df): Likewise.
	(vec_unpacks_hi_v4sf): Likewise.
	(aarch64_float_truncate_lo_v2sf): Likewise.
	(aarch64_float_truncate_hi_v4sf): Likewise.
	(vec_pack_trunc_v2df): Likewise.
	(vec_pack_trunc_df): Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 029e091..2aa9877 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -338,3 +338,9 @@
   BUILTIN_VDQF (BINOP, frecps, 0)
 
   BUILTIN_VDQF (UNOP, abs, 2)
+
+  VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf)
+  VAR1 (BINOP, float_truncate_hi_, 0, v4sf)
+
+  VAR1 (UNOP, float_extend_lo_, 0, v2df)
+  VAR1 (UNOP, float_truncate_lo_, 0, v2sf)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 067c849..4546094 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1267,6 +1267,108 @@
    (set_attr "simd_mode" "<MODE>")]
 )
 
+;; Conversions between vectors of floats and doubles.
+;; Contains a mix of patterns to match standard pattern names
+;; and those for intrinsics.
+
+;; Float widening operations.
+
+(define_insn "vec_unpacks_lo_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "w")
+	    (parallel [(const_int 0) (const_int 1)])
+	  )))]
+  "TARGET_SIMD"
+  "fcvtl\\t%0.2d, %1.2s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+(define_insn "aarch64_float_extend_lo_v2df"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (match_operand:V2SF 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "fcvtl\\t%0.2d, %1.2s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+(define_insn "vec_unpacks_hi_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=w")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "w")
+	    (parallel [(const_int 2) (const_int 3)])
+	  )))]
+  "TARGET_SIMD"
+  "fcvtl2\\t%0.2d, %1.4s"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2DF")]
+)
+
+;; Float narrowing operations.
+
+(define_insn "aarch64_float_truncate_lo_v2sf"
+  [(set (match_operand:V2SF 0 "register_operand" "=w")
+      (float_truncate:V2SF
+	(match_operand:V2DF 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "fcvtn\\t%0.2s, %1.2d"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V2SF")]
+)
+
+(define_insn "aarch64_float_truncate_hi_v4sf"
+  [(set (match_operand:V4SF 0 "register_operand" "=w")
+    (vec_concat:V4SF
+      (match_operand:V2SF 1 "register_operand" "0")
+      (float_truncate:V2SF
+	(match_operand:V2DF 2 "register_operand" "w"))))]
+  "TARGET_SIMD"
+  "fcvtn2\\t%0.4s, %2.2d"
+  [(set_attr "simd_type" "simd_fcvtl")
+   (set_attr "simd_mode" "V4SF")]
+)
+
+(define_expand "vec_pack_trunc_v2df"
+  [(set (match_operand:V4SF 0 "register_operand")
+      (vec_concat:V4SF
+	(float_truncate:V2SF
+	    (match_operand:V2DF 1 "register_operand"))
+	(float_truncate:V2SF
+	    (match_operand:V2DF 2 "register_operand"))
+	  ))]
+  "TARGET_SIMD"
+  {
+    rtx tmp = gen_reg_rtx (V2SFmode);
+    emit_insn (gen_aarch64_float_truncate_lo_v2sf (tmp, operands[1]));
+    emit_insn (gen_aarch64_float_truncate_hi_v4sf (operands[0],
+						   tmp, operands[2]));
+    DONE;
+  }
+)
+
+(define_expand "vec_pack_trunc_df"
+  [(set (match_operand:V2SF 0 "register_operand")
+      (vec_concat:V2SF
+	(float_truncate:SF
+	    (match_operand:DF 1 "register_operand"))
+	(float_truncate:SF
+	    (match_operand:DF 2 "register_operand"))
+	  ))]
+  "TARGET_SIMD"
+  {
+    rtx tmp = gen_reg_rtx (V2DFmode);
+    emit_insn (gen_move_lo_quad_v2df (tmp, operands[1]));
+    emit_insn (gen_move_hi_quad_v2df (tmp, operands[2]));
+    emit_insn (gen_aarch64_float_truncate_lo_v2sf (operands[0], tmp));
+    DONE;
+  }
+)
+
 (define_insn "aarch64_vmls<mode>"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
        (minus:VDQF (match_operand:VDQF 1 "register_operand" "0")