From patchwork Thu Jan 3 11:35:18 2013
X-Patchwork-Submitter: Tejas Belagod
X-Patchwork-Id: 209226
Message-ID: <50E56CF6.2000809@arm.com>
Date: Thu, 03 Jan 2013 11:35:18 +0000
From: Tejas Belagod
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft
Subject: [Patch, AArch64] Fix vmovn_high_*, vqmovn_high_* and vqmovun_high_* intrinsics.

Hi,

Attached is a patch that fixes the implementations of the vmovn_high_*,
vqmovn_high_* and vqmovun_high_* intrinsics in arm_neon.h. The runtime bug
was caused by the inline asm for xtn2 (and its saturating variants sqxtn2,
uqxtn2 and sqxtun2) using the wrong operand number for the source operand:
%2 instead of %1.

Tested on aarch64-none-elf.

OK for trunk and 4.7?

Thanks,
Tejas Belagod
ARM.

2013-01-03  Tejas Belagod

gcc/
	* config/aarch64/arm_neon.h (vmovn_high_*, vqmovn_high_*,
	vqmovun_high_*): Fix source operand number.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index e8fafa6..c7f4323 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -11647,7 +11647,7 @@ __extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
 vmovn_high_s16 (int8x8_t a, int16x8_t b)
 {
   int8x16_t result = vcombine_s8 (a, vcreate_s8 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.16b,%2.8h"
+  __asm__ ("xtn2 %0.16b,%1.8h"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -11658,7 +11658,7 @@ __extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
 vmovn_high_s32 (int16x4_t a, int32x4_t b)
 {
   int16x8_t result = vcombine_s16 (a, vcreate_s16 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.8h,%2.4s"
+  __asm__ ("xtn2 %0.8h,%1.4s"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -11669,7 +11669,7 @@ __extension__ static __inline int32x4_t __attribute__ ((__always_inline__))
 vmovn_high_s64 (int32x2_t a, int64x2_t b)
 {
   int32x4_t result = vcombine_s32 (a, vcreate_s32 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.4s,%2.2d"
+  __asm__ ("xtn2 %0.4s,%1.2d"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -11680,7 +11680,7 @@ __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vmovn_high_u16 (uint8x8_t a, uint16x8_t b)
 {
   uint8x16_t result = vcombine_u8 (a, vcreate_u8 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.16b,%2.8h"
+  __asm__ ("xtn2 %0.16b,%1.8h"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -11691,7 +11691,7 @@ __extension__ static __inline uint16x8_t __attribute__ ((__always_inline__))
 vmovn_high_u32 (uint16x4_t a, uint32x4_t b)
 {
   uint16x8_t result = vcombine_u16 (a, vcreate_u16 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.8h,%2.4s"
+  __asm__ ("xtn2 %0.8h,%1.4s"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -11702,7 +11702,7 @@ __extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
 vmovn_high_u64 (uint32x2_t a, uint64x2_t b)
 {
   uint32x4_t result = vcombine_u32 (a, vcreate_u32 (UINT64_C (0x0)));
-  __asm__ ("xtn2 %0.4s,%2.2d"
+  __asm__ ("xtn2 %0.4s,%1.2d"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14121,7 +14121,7 @@ __extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
 vqmovn_high_s16 (int8x8_t a, int16x8_t b)
 {
   int8x16_t result = vcombine_s8 (a, vcreate_s8 (UINT64_C (0x0)));
-  __asm__ ("sqxtn2 %0.16b, %2.8h"
+  __asm__ ("sqxtn2 %0.16b, %1.8h"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14132,7 +14132,7 @@ __extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
 vqmovn_high_s32 (int16x4_t a, int32x4_t b)
 {
   int16x8_t result = vcombine_s16 (a, vcreate_s16 (UINT64_C (0x0)));
-  __asm__ ("sqxtn2 %0.8h, %2.4s"
+  __asm__ ("sqxtn2 %0.8h, %1.4s"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14143,7 +14143,7 @@ __extension__ static __inline int32x4_t __attribute__ ((__always_inline__))
 vqmovn_high_s64 (int32x2_t a, int64x2_t b)
 {
   int32x4_t result = vcombine_s32 (a, vcreate_s32 (UINT64_C (0x0)));
-  __asm__ ("sqxtn2 %0.4s, %2.2d"
+  __asm__ ("sqxtn2 %0.4s, %1.2d"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14154,7 +14154,7 @@ __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vqmovn_high_u16 (uint8x8_t a, uint16x8_t b)
 {
   uint8x16_t result = vcombine_u8 (a, vcreate_u8 (UINT64_C (0x0)));
-  __asm__ ("uqxtn2 %0.16b, %2.8h"
+  __asm__ ("uqxtn2 %0.16b, %1.8h"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14165,7 +14165,7 @@ __extension__ static __inline uint16x8_t __attribute__ ((__always_inline__))
 vqmovn_high_u32 (uint16x4_t a, uint32x4_t b)
 {
   uint16x8_t result = vcombine_u16 (a, vcreate_u16 (UINT64_C (0x0)));
-  __asm__ ("uqxtn2 %0.8h, %2.4s"
+  __asm__ ("uqxtn2 %0.8h, %1.4s"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14176,7 +14176,7 @@ __extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
 vqmovn_high_u64 (uint32x2_t a, uint64x2_t b)
 {
   uint32x4_t result = vcombine_u32 (a, vcreate_u32 (UINT64_C (0x0)));
-  __asm__ ("uqxtn2 %0.4s, %2.2d"
+  __asm__ ("uqxtn2 %0.4s, %1.2d"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14187,7 +14187,7 @@ __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vqmovun_high_s16 (uint8x8_t a, int16x8_t b)
 {
   uint8x16_t result = vcombine_u8 (a, vcreate_u8 (UINT64_C (0x0)));
-  __asm__ ("sqxtun2 %0.16b, %2.8h"
+  __asm__ ("sqxtun2 %0.16b, %1.8h"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14198,7 +14198,7 @@ __extension__ static __inline uint16x8_t __attribute__ ((__always_inline__))
 vqmovun_high_s32 (uint16x4_t a, int32x4_t b)
 {
   uint16x8_t result = vcombine_u16 (a, vcreate_u16 (UINT64_C (0x0)));
-  __asm__ ("sqxtun2 %0.8h, %2.4s"
+  __asm__ ("sqxtun2 %0.8h, %1.4s"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
@@ -14209,7 +14209,7 @@ __extension__ static __inline uint32x4_t __attribute__ ((__always_inline__))
 vqmovun_high_s64 (uint32x2_t a, int64x2_t b)
 {
   uint32x4_t result = vcombine_u32 (a, vcreate_u32 (UINT64_C (0x0)));
-  __asm__ ("sqxtun2 %0.4s, %2.2d"
+  __asm__ ("sqxtun2 %0.4s, %1.2d"
            : "+w"(result)
            : "w"(b)
            : /* No clobbers */);
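
For reference, the results the fixed intrinsics are expected to produce can be modelled in plain C. The sketch below is not part of the patch and is not how arm_neon.h implements anything. The ref_* names are hypothetical, only the 16-bit to 8-bit variants are shown, and it assumes the usual semantics: the low half of the result comes from `a`, and the high half holds the narrowed lanes of `b` (truncated for xtn2, saturated for sqxtn2 and sqxtun2). A test comparing the intrinsics against a model of this kind would be one way to catch a wrong source operand.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference models (hypothetical names, not part of arm_neon.h).
   Each copies a into out[0..7] and the narrowed lanes of b into out[8..15]. */

/* XTN2 model (vmovn_high_s16): keep the low 8 bits of each 16-bit lane. */
static void ref_vmovn_high_s16(int8_t out[16], const int8_t a[8], const int16_t b[8])
{
    for (int i = 0; i < 8; i++) {
        uint8_t low = (uint8_t)((uint16_t)b[i] & 0xffu);
        out[i] = a[i];
        /* Map 0..255 onto -128..127 without implementation-defined conversion. */
        out[i + 8] = (int8_t)(low < 128 ? low : low - 256);
    }
}

/* SQXTN2 model (vqmovn_high_s16): saturate each lane to [-128, 127]. */
static void ref_vqmovn_high_s16(int8_t out[16], const int8_t a[8], const int16_t b[8])
{
    for (int i = 0; i < 8; i++) {
        int v = b[i];
        out[i] = a[i];
        out[i + 8] = (int8_t)(v > INT8_MAX ? INT8_MAX : v < INT8_MIN ? INT8_MIN : v);
    }
}

/* SQXTUN2 model (vqmovun_high_s16): saturate each signed lane to [0, 255]. */
static void ref_vqmovun_high_s16(uint8_t out[16], const uint8_t a[8], const int16_t b[8])
{
    for (int i = 0; i < 8; i++) {
        int v = b[i];
        out[i] = a[i];
        out[i + 8] = (uint8_t)(v > UINT8_MAX ? UINT8_MAX : v < 0 ? 0 : v);
    }
}

/* Small self-check of the models on hand-picked lanes. */
static void ref_selfcheck(void)
{
    const int8_t a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const int16_t b[8] = {0x0100, 0x017f, 0x0080, -1, 300, -300, 127, -128};
    int8_t r[16];

    ref_vmovn_high_s16(r, a, b);
    assert(r[0] == 1 && r[7] == 8);        /* low half comes from a */
    assert(r[8] == 0 && r[9] == 127);      /* 0x0100 -> 0x00, 0x017f -> 0x7f */
    assert(r[10] == -128 && r[11] == -1);  /* 0x0080 -> 0x80, 0xffff -> 0xff */

    ref_vqmovn_high_s16(r, a, b);
    assert(r[8] == 127 && r[12] == 127);   /* 256 and 300 saturate high */
    assert(r[13] == -128 && r[14] == 127); /* -300 saturates low, 127 is kept */
}
```

On an AArch64 target, the array produced by a model could be compared lane by lane with the vector returned by the corresponding intrinsic; calling ref_selfcheck() only validates the models themselves.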