From patchwork Tue Feb 19 21:03:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 1044895
To: gcc-patches@gcc.gnu.org
Cc: Segher Boessenkool
From: Paul Clarke
Subject: [PATCH v2][rs6000] PR89338, PR89339: Fix compat vector intrinsics for BE and 32-bit
Date: Tue, 19 Feb 2019 15:03:58 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
Message-Id: <50b726a8-5857-3cd1-0d3b-a08e0e13fdf9@us.ibm.com>

Test FAILS: sse2-cvtpd2dq-1, sse2-cvtpd2ps, sse2-cvttpd2dq on powerpc64
(big-endian).

_mm_cvtpd_epi32, _mm_cvtpd_ps, _mm_cvttpd_epi32: Type conversion from vector
doubleword type to vector word type leaves the results in even lanes in
big endian mode.

Test FAILS: sse-cvtss2si-1, sse-cvtss2si-2, sse-movmskb-1 on powerpc (32-bit
big-endian).

Incorrect type for interpreting the result from mfvsrd instruction leads to
incorrect results.  Also, mfvsrd instruction only works as expected in 64-bit
mode or for 32-bit quantities in 32-bit mode.  A more general, if slower,
solution is needed for 32-bit mode.

2019-02-19  Paul A. Clarke

[gcc]

	* config/rs6000/emmintrin.h (_mm_cvtpd_epi32): Fix big endian.
	(_mm_cvtpd_ps): Likewise.
	(_mm_cvttpd_epi32): Likewise.

	PR89338
	* config/rs6000/xmmintrin.h (_mm_cvtss_f32): Fix type mismatch.
	(_mm_cvt_ss2si): Fix type mismatch and 32-bit.

	PR89339
	* config/rs6000/xmmintrin.h (_mm_movemask_pi8): Fix 32-bit.

---
v2: more elegant solution for the 32-bit mode fix in _mm_movemask_pi8, as
suggested by Segher.

Index: gcc/config/rs6000/emmintrin.h
===================================================================
diff --git a/trunk/gcc/config/rs6000/emmintrin.h b/trunk/gcc/config/rs6000/emmintrin.h
--- a/trunk/gcc/config/rs6000/emmintrin.h	(revision 268997)
+++ b/trunk/gcc/config/rs6000/emmintrin.h	(working copy)
@@ -887,7 +887,11 @@ _mm_cvtpd_epi32 (__m128d __A)
       : );
 
 #ifdef _ARCH_PWR8
+#ifdef __LITTLE_ENDIAN__
   temp = vec_mergeo (temp, temp);
+#else
+  temp = vec_mergee (temp, temp);
+#endif
   result = (__v4si) vec_vpkudum ((__vector long long) temp,
 				 (__vector long long) vzero);
 #else
@@ -922,7 +926,11 @@ _mm_cvtpd_ps (__m128d __A)
       : );
 
 #ifdef _ARCH_PWR8
+#ifdef __LITTLE_ENDIAN__
   temp = vec_mergeo (temp, temp);
+#else
+  temp = vec_mergee (temp, temp);
+#endif
   result = (__v4sf) vec_vpkudum ((__vector long long) temp,
 				 (__vector long long) vzero);
 #else
@@ -951,7 +959,11 @@ _mm_cvttpd_epi32 (__m128d __A)
       : );
 
 #ifdef _ARCH_PWR8
+#ifdef __LITTLE_ENDIAN__
   temp = vec_mergeo (temp, temp);
+#else
+  temp = vec_mergee (temp, temp);
+#endif
   result = (__v4si) vec_vpkudum ((__vector long long) temp,
 				 (__vector long long) vzero);
 #else
Index: gcc/config/rs6000/xmmintrin.h
===================================================================
diff --git a/trunk/gcc/config/rs6000/xmmintrin.h b/trunk/gcc/config/rs6000/xmmintrin.h
--- a/trunk/gcc/config/rs6000/xmmintrin.h	(revision 268997)
+++ b/trunk/gcc/config/rs6000/xmmintrin.h	(working copy)
@@ -905,7 +905,7 @@ _mm_cvtss_f32 (__m128 __A)
 extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtss_si32 (__m128 __A)
 {
-  __m64 res = 0;
+  int res;
 #ifdef _ARCH_PWR8
   double dtmp;
   __asm__(
@@ -938,8 +938,8 @@ _mm_cvt_ss2si (__m128 __A)
 extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtss_si64 (__m128 __A)
 {
-  __m64 res = 0;
-#ifdef _ARCH_PWR8
+  long long res;
+#if defined (_ARCH_PWR8) && defined (__powerpc64__)
   double dtmp;
   __asm__(
 #ifdef __LITTLE_ENDIAN__
@@ -1577,6 +1577,7 @@ _m_pminub (__m64 __A, __m64 __B)
 extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_movemask_pi8 (__m64 __A)
 {
+#ifdef __powerpc64__
   unsigned long long p =
 #ifdef __LITTLE_ENDIAN__
                        0x0008101820283038UL; // permute control for sign bits
@@ -1584,6 +1585,12 @@ _mm_movemask_pi8 (__m64 __A)
                        0x3830282018100800UL; // permute control for sign bits
 #endif
   return __builtin_bpermd (p, __A);
+#else
+  unsigned int mask = 0x20283038UL;
+  unsigned int r1 = __builtin_bpermd (mask, __A) & 0xf;
+  unsigned int r2 = __builtin_bpermd (mask, __A >> 32) & 0xf;
+  return (r2 << 4) | r1;
+#endif
 }
 
 extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
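
Note on the _mm_movemask_pi8 change: the following is a minimal sketch, not part of the patch, that models bpermd in portable C and checks that two permutes over 32-bit halves with control 0x20283038 give the same mask as the single 64-bit permute with the little-endian control word. The names model_bpermd, movemask_full, movemask_split, and self_check are hypothetical. The model assumes the Power ISA definition of bpermd (each control byte indexes one source bit, bit 0 is the most significant, and an index of 64 or more yields 0). It does not cover the big-endian control word or the undefined upper register halves in 32-bit mode.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of bpermd (an illustrative emulation, not the GCC
   builtin).  Byte i of SEL, counted from the most significant byte,
   selects bit SEL[i] of SRC, where bit 0 is the most significant bit.
   The eight selected bits form the low byte of the result, with the
   bit chosen by byte 0 of SEL ending up most significant.  */
static uint64_t
model_bpermd (uint64_t sel, uint64_t src)
{
  uint64_t result = 0;
  for (int i = 0; i < 8; i++)
    {
      unsigned idx = (sel >> (56 - 8 * i)) & 0xff;
      uint64_t bit = idx < 64 ? (src >> (63 - idx)) & 1 : 0;
      result = (result << 1) | bit;
    }
  return result;
}

/* One 64-bit permute with the little-endian control word.  */
static int
movemask_full (uint64_t a)
{
  return (int) model_bpermd (0x0008101820283038ULL, a);
}

/* Two permutes, one per 32-bit half.  The control word only names
   bits 32..63, so the upper four result bits are masked off.  */
static int
movemask_split (uint64_t a)
{
  uint32_t mask = 0x20283038U;
  unsigned r1 = model_bpermd (mask, (uint32_t) a) & 0xf;
  unsigned r2 = model_bpermd (mask, (uint32_t) (a >> 32)) & 0xf;
  return (int) ((r2 << 4) | r1);
}

/* Compare both variants over all 256 sign-bit patterns: bit i of M
   sets the sign bit of byte i, counting from the least significant.  */
static void
self_check (void)
{
  for (unsigned m = 0; m < 256; m++)
    {
      uint64_t a = 0;
      for (int i = 0; i < 8; i++)
        if (m & (1u << i))
          a |= (uint64_t) 0x80 << (8 * i);
      assert (movemask_full (a) == (int) m);
      assert (movemask_split (a) == (int) m);
    }
}
```

Calling self_check () from a test driver runs through every sign pattern. Under this model, the `& 0xf` discards the four result bits produced by the zero control bytes in the upper half of the 32-bit control word.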