From patchwork Fri Jun 6 19:02:17 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Meissner X-Patchwork-Id: 356988 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 5D412140099 for ; Sat, 7 Jun 2014 05:02:35 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=oetETeDGTwFwir66u53ixBFeYo37F2hiVUC+KfgnE0bYLMr65d4lE tDt5VPvRqS8hvZCVp4TVWC7TSKluvjSfPH2pbJpZxh9M3yi3+9XNN/deDnOHuEi4 LESeY3qmKlH385Owz/pMHu14wiEE4ZxxbCYVuMadCttBaTc8p/cex4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=PsT3NUgwEWZ3ZC4lz9pLCohAM34=; b=Fona22RksZ+Q0g13WHf/ hnOef2hswkyrtv8L21z6Sl328K4DIFsAJ7efGV50DKi6UIqHyB+1lXKgjTbVw8gA 98wWZxatMoo743koCIe5Iz80TnaWzc1atzR74cpkFqVh7gtUtbKJfbCVTtyct4rc VQE/xpeIiYpT0V7+Z0by6jI= Received: (qmail 11455 invoked by alias); 6 Jun 2014 19:02:27 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 11440 invoked by uid 89); 6 Jun 2014 19:02:26 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL, BAYES_00 autolearn=ham version=3.3.2 X-HELO: e33.co.us.ibm.com Received: from e33.co.us.ibm.com (HELO e33.co.us.ibm.com) (32.97.110.151) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Fri, 06 Jun 2014 19:02:25 +0000 Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 6 Jun 2014 13:02:22 -0600 Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e33.co.us.ibm.com (192.168.1.133) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 6 Jun 2014 13:02:20 -0600 Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id BB63D3E4003B for ; Fri, 6 Jun 2014 13:02:19 -0600 (MDT) Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by b03cxnp07029.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s56Gwtdj524738 for ; Fri, 6 Jun 2014 18:58:56 +0200 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id s56J2Jfq027682 for ; Fri, 6 Jun 2014 13:02:19 -0600 Received: from ibm-tiger.the-meissners.org (dhcp-9-32-77-206.usma.ibm.com [9.32.77.206]) by d03av04.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id s56J2Ijg027575; Fri, 6 Jun 2014 13:02:19 -0600 Received: by ibm-tiger.the-meissners.org (Postfix, from userid 500) id 082694296E; Fri, 6 Jun 2014 15:02:17 -0400 (EDT) Date: Fri, 6 Jun 2014 15:02:17 -0400 From: Michael Meissner To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com Subject: [PATCH], PowerPC PR 61431, Fix little endian word swapping for 128-bit types Message-ID: <20140606190217.GA8602@ibm-tiger.the-meissners.org> Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, dje.gcc@gmail.com MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-12-10) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14060619-0928-0000-0000-000002789D69 X-IsSubscribed: yes The pack01.c test fails on GCC 4.8 on little endian power8 systems. In looking at it, it is a bug where the V1TI memory operations do not have the word swapping define_split support. GCC 4.9 and trunk can optimize the union to stay in a register, so the test case passes on those systems, but it is still a bug that would be exposed if you ever need to store vector __int128 values. The test p8vector-int128-2.c is such a case, and it needs the fix. I've tested this via bootstrap and make check on power8 big and little endian systems, and there were no regressions. The test p8vector-int128-2.c now passes on the little endian power8 with this fix. Are these patches ok to check into the trunk, and 4.9/4.8 branches? 2014-06-06 Michael Meissner PR target/61431 * config/rs6000/vsx.md (VSX_LE): Split VSX_D into 2 separate iterators, VSX_D that handles 64-bit types, and VSX_LE that handles swapping the two 64-bit double words on little endian systems. Include V1TImode and optionally TImode in VSX_LE so that these types are properly swapped. Change all of the insns and splits that do the 64-bit swaps to use VSX_LE. (vsx_le_perm_load_): Likewise. (vsx_le_perm_store_): Likewise. (splitters for little endian memory operations): Likewise. (vsx_xxpermdi2_le_): Likewise. (vsx_lxvd2x2_le_): Likewise. (vsx_stxvd2x2_le_): Likewise. Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 211316) +++ gcc/config/rs6000/vsx.md (working copy) @@ -24,6 +24,13 @@ (define_mode_iterator VSX_B [DF V4SF V2D ;; Iterator for the 2 64-bit vector types (define_mode_iterator VSX_D [V2DF V2DI]) +;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with +;; lxvd2x to properly handle swapping words on little endian +(define_mode_iterator VSX_LE [V2DF + V2DI + V1TI + (TI "VECTOR_MEM_VSX_P (TImode)")]) + ;; Iterator for the 2 32-bit vector types (define_mode_iterator VSX_W [V4SF V4SI]) @@ -228,8 +235,8 @@ (define_c_enum "unspec" ;; The patterns for LE permuted loads and stores come before the general ;; VSX moves so they match first. (define_insn_and_split "*vsx_le_perm_load_" - [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa") - (match_operand:VSX_D 1 "memory_operand" "Z"))] + [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa") + (match_operand:VSX_LE 1 "memory_operand" "Z"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" "!BYTES_BIG_ENDIAN && TARGET_VSX" @@ -342,16 +349,16 @@ (define_insn_and_split "*vsx_le_perm_loa (set_attr "length" "8")]) (define_insn "*vsx_le_perm_store_" - [(set (match_operand:VSX_D 0 "memory_operand" "=Z") - (match_operand:VSX_D 1 "vsx_register_operand" "+wa"))] + [(set (match_operand:VSX_LE 0 "memory_operand" "=Z") + (match_operand:VSX_LE 1 "vsx_register_operand" "+wa"))] "!BYTES_BIG_ENDIAN && TARGET_VSX" "#" [(set_attr "type" "vecstore") (set_attr "length" "12")]) (define_split - [(set (match_operand:VSX_D 0 "memory_operand" "") - (match_operand:VSX_D 1 "vsx_register_operand" ""))] + [(set (match_operand:VSX_LE 0 "memory_operand" "") + (match_operand:VSX_LE 1 "vsx_register_operand" ""))] "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed" [(set (match_dup 2) (vec_select: @@ -369,8 +376,8 @@ (define_split ;; The post-reload split requires that we re-permute the source ;; register in case it is still live. (define_split - [(set (match_operand:VSX_D 0 "memory_operand" "") - (match_operand:VSX_D 1 "vsx_register_operand" ""))] + [(set (match_operand:VSX_LE 0 "memory_operand" "") + (match_operand:VSX_LE 1 "vsx_register_operand" ""))] "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed" [(set (match_dup 1) (vec_select: @@ -1353,9 +1360,9 @@ (define_insn "vsx_concat_v2sf" ;; xxpermdi for little endian loads and stores. We need several of ;; these since the form of the PARALLEL differs by mode. (define_insn "*vsx_xxpermdi2_le_" - [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa") - (vec_select:VSX_D - (match_operand:VSX_D 1 "vsx_register_operand" "wa") + [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa") + (vec_select:VSX_LE + (match_operand:VSX_LE 1 "vsx_register_operand" "wa") (parallel [(const_int 1) (const_int 0)])))] "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (mode)" "xxpermdi %x0,%x1,%x1,2" @@ -1402,9 +1409,9 @@ (define_insn "*vsx_xxpermdi16_le_V16QI" ;; lxvd2x for little endian loads. We need several of ;; these since the form of the PARALLEL differs by mode. (define_insn "*vsx_lxvd2x2_le_" - [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa") - (vec_select:VSX_D - (match_operand:VSX_D 1 "memory_operand" "Z") + [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=wa") + (vec_select:VSX_LE + (match_operand:VSX_LE 1 "memory_operand" "Z") (parallel [(const_int 1) (const_int 0)])))] "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (mode)" "lxvd2x %x0,%y1" @@ -1451,9 +1458,9 @@ (define_insn "*vsx_lxvd2x16_le_V16QI" ;; stxvd2x for little endian stores. We need several of ;; these since the form of the PARALLEL differs by mode. (define_insn "*vsx_stxvd2x2_le_" - [(set (match_operand:VSX_D 0 "memory_operand" "=Z") - (vec_select:VSX_D - (match_operand:VSX_D 1 "vsx_register_operand" "wa") + [(set (match_operand:VSX_LE 0 "memory_operand" "=Z") + (vec_select:VSX_LE + (match_operand:VSX_LE 1 "vsx_register_operand" "wa") (parallel [(const_int 1) (const_int 0)])))] "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (mode)" "stxvd2x %x1,%y0"