From patchwork Fri Apr 13 20:21:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 898090
To: gcc-patches@gcc.gnu.org, Segher Boessenkool
From: Paul Clarke
Subject: [PATCH v2, rs6000] (PR83402) Fix _mm_slli_epi{32, 64} for shift values 16 through 31 and negative
Date: Fri, 13 Apr 2018 15:21:14 -0500
Message-Id: <7583e423-6494-5288-0985-01fddc6f59f6@us.ibm.com>

The powerpc versions of _mm_slli_epi32 and _mm_slli_epi64 in emmintrin.h
do not properly handle shift values between 16 and 31, inclusive.
These set up the shift with vec_splat_s32, which only accepts *5-bit signed*
shift values, that is, a range of -16 to 15.  Values above 15 produce an
error:

  error: argument 1 must be a 5-bit signed literal

The fix is to reduce the range for which vec_splat_s32 is used to
0 <= shift < 16 and to use vec_splats otherwise.

Also, _mm_slli_epi{16,32,64}, when given a negative shift value, should
always return a vector of {0}.

2018-04-13  Paul A. Clarke

Changes in v2:
- Fixed the "shift by 0" cases, which were being treated as negative shifts
  and returning {0} instead of a no-op.  (Segher)
- Fixed the test cases to correctly test the shift-by-zero cases.

gcc/

    PR target/83402
    * config/rs6000/emmintrin.h (_mm_slli_epi{16,32,64}): Ensure that
    vec_splat_s32 is only called with 0 <= shift < 16.  Ensure negative
    shifts result in {0}.

gcc/testsuite/

    PR target/83402
    * gcc.target/powerpc/sse2-psllw-1.c: Refactor and add tests for
    several values: positive, negative, and zero.
    * gcc.target/powerpc/sse2-pslld-1.c: Same.
    * gcc.target/powerpc/sse2-psllq-1.c: Same.
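For readers who want the intended behaviour in one place, here is a minimal scalar sketch in portable C. It is not the emmintrin.h code and uses no AltiVec built-ins: `shift_path`, `choose_path`, and `slli_epi32_model` are hypothetical names introduced only for illustration. It assumes the rules stated above: a shift that is negative or at least the lane width yields {0}, a compile-time constant in 0..15 may use the literal splat, and everything else goes through the general splat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three paths the patched intrinsics take. */
enum shift_path {
  PATH_ZERO,          /* result stays {0}: shift negative or >= lane width */
  PATH_SPLAT_LITERAL, /* constant shift in 0..15: fits a 5-bit signed literal */
  PATH_SPLATS         /* any other in-range shift: general splat */
};

/* Illustrative only: mirrors the control flow described in the message.
   lane_bits is 16, 32 or 64; is_constant stands in for
   __builtin_constant_p (__B).  */
static enum shift_path
choose_path (int b, int lane_bits, int is_constant)
{
  assert (lane_bits == 16 || lane_bits == 32 || lane_bits == 64);
  if (b < 0 || b >= lane_bits)
    return PATH_ZERO;
  if (is_constant && b < 16)
    return PATH_SPLAT_LITERAL;
  return PATH_SPLATS;
}

/* Scalar model of the expected _mm_slli_epi32 result on four lanes. */
static void
slli_epi32_model (const uint32_t in[4], int b, uint32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (b >= 0 && b < 32) ? in[i] << b : 0u;
}
```

Under these assumptions, a constant shift of 16 on 32-bit lanes selects `PATH_SPLATS` rather than the literal splat, and a shift of 0 is a no-op rather than {0}.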
---
PC

Index: gcc/config/rs6000/emmintrin.h
===================================================================
--- gcc/config/rs6000/emmintrin.h (revision 259375)
+++ gcc/config/rs6000/emmintrin.h (working copy)
@@ -1488,7 +1488,7 @@ _mm_slli_epi16 (__m128i __A, int __B)
   __v8hu lshift;
   __v8hi result = { 0, 0, 0, 0, 0, 0, 0, 0 };
 
-  if (__B < 16)
+  if (__B >= 0 && __B < 16)
     {
       if (__builtin_constant_p(__B))
         lshift = (__v8hu) vec_splat_s16(__B);
@@ -1507,12 +1507,12 @@ _mm_slli_epi32 (__m128i __A, int __B)
   __v4su lshift;
   __v4si result = { 0, 0, 0, 0 };
 
-  if (__B < 32)
+  if (__B >= 0 && __B < 32)
     {
-      if (__builtin_constant_p(__B))
-        lshift = (__v4su) vec_splat_s32(__B);
+      if (__builtin_constant_p(__B) && __B < 16)
+        lshift = (__v4su) vec_splat_s32(__B);
       else
-        lshift = vec_splats ((unsigned int) __B);
+        lshift = vec_splats ((unsigned int) __B);
 
       result = vec_vslw ((__v4si) __A, lshift);
     }
@@ -1527,17 +1527,12 @@ _mm_slli_epi64 (__m128i __A, int __B)
   __v2du lshift;
   __v2di result = { 0, 0 };
 
-  if (__B < 64)
+  if (__B >= 0 && __B < 64)
     {
-      if (__builtin_constant_p(__B))
-        {
-          if (__B < 32)
-            lshift = (__v2du) vec_splat_s32(__B);
-          else
-            lshift = (__v2du) vec_splats((unsigned long long)__B);
-        }
+      if (__builtin_constant_p(__B) && __B < 16)
+        lshift = (__v2du) vec_splat_s32(__B);
       else
-        lshift = (__v2du) vec_splats ((unsigned int) __B);
+        lshift = (__v2du) vec_splats ((unsigned int) __B);
 
       result = vec_vsld ((__v2di) __A, lshift);
     }
Index: gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c (revision 259375)
+++ gcc/testsuite/gcc.target/powerpc/sse2-pslld-1.c (working copy)
@@ -13,32 +13,50 @@
 #define TEST sse2_test_pslld_1
 #endif
 
-#define N 0xf
-
 #include
 
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi32 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi32 (s1, N); \
+  }
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(31, 31)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
+
+#define TEST_CODE(id, N) \
+  { \
+    int e[4] = {0}; \
+    union128i_d u, s; \
+    int i; \
+    s.x = _mm_set_epi32 (1, -2, 3, 4); \
+    u.x = test##id (s.x); \
+    if (N >= 0 && N < 32) \
+      for (i = 0; i < 4; i++) \
+        e[i] = s.a[i] << (N * (N >= 0)); \
+    if (check_union128i_d (u, e)) \
+      abort (); \
+  }
+
 
 static void
 TEST (void)
 {
-  union128i_d u, s;
-  int e[4] = {0};
-  int i;
-
-  s.x = _mm_set_epi32 (1, -2, 3, 4);
-
-  u.x = test (s.x);
-
-  if (N < 32)
-    for (i = 0; i < 4; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_d (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(31, 31);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 }
Index: gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c (revision 259375)
+++ gcc/testsuite/gcc.target/powerpc/sse2-psllq-1.c (working copy)
@@ -13,36 +13,56 @@
 #define TEST sse2_test_psllq_1
 #endif
 
-#define N 60
-
 #include
 
 #ifdef _ARCH_PWR8
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi64 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi64 (s1, N); \
+  }
+
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(31, 31)
+TEST_FUNC(63, 63)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
 #endif
 
+#define TEST_CODE(id, N) \
+  { \
+    union128i_q u, s; \
+    long long e[2] = {0}; \
+    int i; \
+    s.x = _mm_set_epi64x (-1, 0xf); \
+    u.x = test##id (s.x); \
+    if (N >= 0 && N < 64) \
+      for (i = 0; i < 2; i++) \
+        e[i] = s.a[i] << (N * (N >= 0)); \
+    if (check_union128i_q (u, e)) \
+      abort (); \
+  }
+
 static void
 TEST (void)
 {
 #ifdef _ARCH_PWR8
-  union128i_q u, s;
-  long long e[2] = {0};
-  int i;
-
-  s.x = _mm_set_epi64x (-1, 0xf);
-
-  u.x = test (s.x);
-
-  if (N < 64)
-    for (i = 0; i < 2; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_q (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(31, 31);
+  TEST_CODE(63, 63);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 #endif
 }
Index: gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c (revision 259375)
+++ gcc/testsuite/gcc.target/powerpc/sse2-psllw-1.c (working copy)
@@ -13,32 +13,48 @@
 #define TEST sse2_test_psllw_1
 #endif
 
-#define N 0xb
-
 #include
 
-static __m128i
-__attribute__((noinline, unused))
-test (__m128i s1)
-{
-  return _mm_slli_epi16 (s1, N);
-}
+#define TEST_FUNC(id, N) \
+  static __m128i \
+  __attribute__((noinline, unused)) \
+  test##id (__m128i s1) \
+  { \
+    return _mm_slli_epi16 (s1, N); \
+  }
+TEST_FUNC(0, 0)
+TEST_FUNC(15, 15)
+TEST_FUNC(16, 16)
+TEST_FUNC(neg1, -1)
+TEST_FUNC(neg16, -16)
+TEST_FUNC(neg32, -32)
+TEST_FUNC(neg64, -64)
+TEST_FUNC(neg128, -128)
+
+#define TEST_CODE(id, N) \
+  { \
+    short e[8] = {0}; \
+    union128i_w u, s; \
+    int i; \
+    s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000); \
+    u.x = test##id (s.x); \
+    if (N >= 0 && N < 16) \
+      for (i = 0; i < 8; i++) \
+        e[i] = s.a[i] << (N * (N >= 0)); \
+    if (check_union128i_w (u, e)) \
+      abort (); \
+  }
+
 
 static void
 TEST (void)
 {
-  union128i_w u, s;
-  short e[8] = {0};
-  int i;
-
-  s.x = _mm_set_epi16 (1, 2, 3, 4, 5, 6, 0x7000, 0x9000);
-
-  u.x = test (s.x);
-
-  if (N < 16)
-    for (i = 0; i < 8; i++)
-      e[i] = s.a[i] << N;
-
-  if (check_union128i_w (u, e))
-    abort ();
+  TEST_CODE(0, 0);
+  TEST_CODE(15, 15);
+  TEST_CODE(16, 16);
+  TEST_CODE(neg1, -1);
+  TEST_CODE(neg16, -16);
+  TEST_CODE(neg32, -32);
+  TEST_CODE(neg64, -64);
+  TEST_CODE(neg128, -128);
 }
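
A note on the expected-value computation in the new TEST_CODE macros, with a small scalar sketch. `expected_epi16` is a hypothetical stand-in for the 16-bit case, written in portable C with unsigned lanes so the shift is well defined; it is not part of the testsuite. As I read the macro, the `(N * (N >= 0))` factor is there so that expansions with a negative literal N never contain a negative shift count, even though the guarding `if` already keeps that branch from running; that reading is an assumption, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar version of the TEST_CODE expectation for 16-bit
   lanes: e[] starts as {0} and is only filled in when n is in range.  */
static void
expected_epi16 (const uint16_t s[8], int n, uint16_t e[8])
{
  assert (s && e);
  for (int i = 0; i < 8; i++)
    e[i] = 0;

  if (n >= 0 && n < 16)
    for (int i = 0; i < 8; i++)
      /* n * (n >= 0) collapses a negative count to 0, as in the macro;
         the cast keeps only the low 16 bits of the shifted lane.  */
      e[i] = (uint16_t) ((uint32_t) s[i] << (n * (n >= 0)));
}
```

With this model a shift of 0 reproduces the input, which is the no-op behaviour the v2 tests check, while 16 and any negative count leave the expectation at {0}.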