From patchwork Fri Apr 13 00:07:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 897823
From: Paul Clarke
Subject: [PATCH, rs6000] (PR84302) Fix _mm_slli_epi{32, 64} for shift values 16 through 31 and negative
To: gcc-patches@gcc.gnu.org
Date: Thu, 12 Apr 2018 19:07:21 -0500

The powerpc versions of _mm_slli_epi32 and _mm_slli_epi64 in emmintrin.h
do not properly handle shift values between 16 and 31, inclusive.
These were setting up the shift with vec_splat_s32, which only accepts
a *5-bit signed* literal, i.e. a range of -16 to 15.  Values above 15
produced an error:

  error: argument 1 must be a 5-bit signed literal

The fix is to restrict the use of vec_splat_s32 to shift values < 16
and to use vec_splats otherwise.

Also, _mm_slli_epi{16,32,64}, when given a negative shift value, should
always return a vector of {0}.

2018-04-12  Paul A. Clarke

gcc/config
        PR target/83402
        * rs6000/emmintrin.h (_mm_slli_epi{16,32,64}): Ensure that
        vec_splat_s32 is only called with 0 < shift < 16.  Ensure negative
        shifts result in {0}.

gcc/testsuite/gcc.target/powerpc
        PR target/83402
        * gcc.target/powerpc/sse2-psllw-1.c: Refactor and add tests for
        several positive and negative values.
        * gcc.target/powerpc/sse2-pslld-1.c: Same.
        * gcc.target/powerpc/sse2-psllq-1.c: Same.

---
Paul Clarke, IBM

Index: gcc/config/rs6000/emmintrin.h
===================================================================
--- gcc/config/rs6000/emmintrin.h (revision 259016)
+++ gcc/config/rs6000/emmintrin.h (working copy)
@@ -1488,7 +1488,7 @@ _mm_slli_epi16 (__m128i __A, int __B)
   __v8hu lshift;
   __v8hi result = { 0, 0, 0, 0, 0, 0, 0, 0 };
 
-  if (__B < 16)
+  if (__B > 0 && __B < 16)
     {
       if (__builtin_constant_p(__B))
         lshift = (__v8hu) vec_splat_s16(__B);
@@ -1507,12 +1507,12 @@ _mm_slli_epi32 (__m128i __A, int __B)
   __v4su lshift;
   __v4si result = { 0, 0, 0, 0 };
 
-  if (__B < 32)
+  if (__B > 0 && __B < 32)
     {
-      if (__builtin_constant_p(__B))
-        lshift = (__v4su) vec_splat_s32(__B);
+      if (__builtin_constant_p(__B) && __B < 16)
+        lshift = (__v4su) vec_splat_s32(__B);
       else
-        lshift = vec_splats ((unsigned int) __B);
+        lshift = vec_splats ((unsigned int) __B);
 
       result = vec_vslw ((__v4si) __A, lshift);
     }
@@ -1527,17 +1527,12 @@ _mm_slli_epi64 (__m128i __A, int __B)
   __v2du lshift;
   __v2di result = { 0, 0 };
 
-  if (__B < 64)
+  if (__B > 0 && __B < 64)
     {
-      if (__builtin_constant_p(__B))
-        {
-          if (__B < 32)
-            lshift = (__v2du) vec_splat_s32(__B);
-          else
-            lshift = (__v2du) vec_splats((unsigned long long)__B);
-        }
+      if (__builtin_constant_p(__B) && __B < 16)
+        lshift = (__v2du) vec_splat_s32(__B);
       else
-        lshift = (__v2du) vec_splats ((unsigned int) __B);
+        lshift = (__v2du) vec_splats ((unsigned int) __B);
 
       result = vec_vsld ((__v2di) __A, lshift);
     }
Index: gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c (revision 259016)
+++ gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c (working copy)
@@ -13,32 +13,50 @@
 #define TEST sse2_test_pslld_1
 #endif
 
-#define N 0xf
-
 #include <emmintrin.h>
 
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi32 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi32 (s1, N); \
+  }
 
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(31, 31)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
+
+#define TEST_CODE(id, N) \
+  { \
+    int e[4] = {0}; \
+    union128i_d u, s; \
+    int i; \
+    s.x = _mm_set_epi32 (1, -2, 3, 4); \
+    u.x = test##id (s.x); \
+    if (N > 0 && N < 32) \
+      for (i = 0; i < 4; i++) \
+        e[i] = s.a[i] << (N * (N > 0)); \
+    if (check_union128i_d (u, e)) \
+      abort (); \
+  }
+
 static void
 TEST (void)
 {
-  union128i_d u, s;
-  int e[4] = {0};
-  int i;
-
-  s.x = _mm_set_epi32 (1, -2, 3, 4);
-
-  u.x = test (s.x);
-
-  if (N < 32)
-    for (i = 0; i < 4; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_d (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(31, 31);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 }
Index: gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c (revision 259016)
+++ gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c (working copy)
@@ -13,36 +13,56 @@
 #define TEST sse2_test_psllq_1
 #endif
 
-#define N 60
-
 #include <emmintrin.h>
 
 #ifdef _ARCH_PWR8
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi64 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi64 (s1, N); \
+  }
+
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(31, 31)
+TEST_FUNC(63, 63)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
 #endif
 
+#define TEST_CODE(id, N) \
+  { \
+    union128i_q u, s; \
+    long long e[2] = {0}; \
+    int i; \
+    s.x = _mm_set_epi64x (-1, 0xf); \
+    u.x = test##id (s.x); \
+    if (N > 0 && N < 64) \
+      for (i = 0; i < 2; i++) \
+        e[i] = s.a[i] << (N * (N > 0)); \
+    if (check_union128i_q (u, e)) \
+      abort (); \
+  }
+
 static void
 TEST (void)
 {
 #ifdef _ARCH_PWR8
-  union128i_q u, s;
-  long long e[2] = {0};
-  int i;
-
-  s.x = _mm_set_epi64x (-1, 0xf);
-
-  u.x = test (s.x);
-
-  if (N < 64)
-    for (i = 0; i < 2; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_q (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(31, 31);
+  TEST_CODE(63, 63);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 #endif
 }
Index: gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c (revision 259016)
+++ gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c (working copy)
@@ -13,32 +13,48 @@
 #define TEST sse2_test_psllw_1
 #endif
 
-#define N 0xb
-
 #include <emmintrin.h>
 
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi16 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi16 (s1, N); \
+  }
 
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
+
+#define TEST_CODE(id, N) \
+  { \
+    short e[8] = {0}; \
+    union128i_w u, s; \
+    int i; \
+    s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000); \
+    u.x = test##id (s.x); \
+    if (N > 0 && N < 16) \
+      for (i = 0; i < 8; i++) \
+        e[i] = s.a[i] << (N * (N > 0)); \
+    if (check_union128i_w (u, e)) \
+      abort (); \
+  }
+
 static void
 TEST (void)
 {
-  union128i_w u, s;
-  short e[8] = {0};
-  int i;
-
-  s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000);
-
-  u.x = test (s.x);
-
-  if (N < 16)
-    for (i = 0; i < 8; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_w (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 }
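
For readers who want to check the guard logic without a powerpc toolchain, here is a minimal scalar sketch in plain C of how the patched _mm_slli_epi32 and _mm_slli_epi64 guards pick a path. It is not part of the patch: the names shift_path, fits_5bit_signed, choose_path and check_examples are hypothetical, and is_constant merely stands in for __builtin_constant_p (__B).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three outcomes of the patched guards.  */
enum shift_path
{
  PATH_ZERO,       /* result stays { 0, ... }  */
  PATH_SPLAT_IMM,  /* vec_splat_s32 (5-bit signed literal)  */
  PATH_SPLATS      /* vec_splats (general splat)  */
};

/* A 5-bit signed literal covers -16 through 15, inclusive.  */
static bool
fits_5bit_signed (int v)
{
  return v >= -16 && v <= 15;
}

/* Mirror the guards in the diff above.  WIDTH is the element width in
   bits (32 or 64); IS_CONSTANT is an assumption standing in for
   __builtin_constant_p (__B).  */
static enum shift_path
choose_path (int b, int width, bool is_constant)
{
  if (!(b > 0 && b < width))
    return PATH_ZERO;
  if (is_constant && b < 16)
    return PATH_SPLAT_IMM;
  return PATH_SPLATS;
}

/* A few of the shift values exercised by the refactored tests.  */
static void
check_examples (void)
{
  assert (choose_path (15, 32, true) == PATH_SPLAT_IMM);
  assert (choose_path (16, 32, true) == PATH_SPLATS);
  assert (choose_path (31, 32, true) == PATH_SPLATS);
  assert (choose_path (-1, 32, true) == PATH_ZERO);
  /* Every value sent down the immediate path is a legal literal.  */
  for (int b = 1; b < 16; b++)
    assert (fits_5bit_signed (b));
}
```

Calling check_examples () from a small main is enough to exercise it. Note that the sketch follows the guards exactly as written in the diff, so a shift of 0 also lands on the zero path.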