From patchwork Fri Jan 31 23:36:31 2014
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 315857
Message-ID: <1391211391.14633.19.camel@gnopaine>
Subject: [PATCH, rs6000] Support -maltivec=be in LE mode for vec_perm builtin
From: Bill Schmidt
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com
Date: Fri, 31 Jan 2014 17:36:31 -0600

Hi,

One more patch in the set supporting -maltivec=be for the Altivec
builtins, this one for vec_perm.  Most of the work was done already in
rs6000.c:altivec_expand_vec_perm_le ().  We can reuse that logic for the
present work by generalizing the code to operate on vector modes other
than V16QImode.  When -maltivec=be is specified for a little endian
target, we call that routine to expand the builtin rather than
generating the vperm pattern directly.  The pattern is then matched by
*altivec_vperm_<mode>{,_uns}_internal to generate the actual vperm
instruction.  (A standalone sketch of the transformation appears after
the ChangeLogs below.)

There are two new test cases that verify correct vec_perm behavior for
BE, LE, and LE with -maltivec=be.  Because vec_perm now works the same
for BE and LE without -maltivec=be, one existing test needs modification
to remove the previous workaround for endianness.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


gcc:

2014-01-31  Bill Schmidt

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_le): Generalize
	for vector modes other than V16QImode.
	* config/rs6000/altivec.md (altivec_vperm_<mode>): Change to a
	define_expand, and call altivec_expand_vec_perm_le when producing
	code with little endian element order.
	(*altivec_vperm_<mode>_internal): New insn having the previous
	behavior of altivec_vperm_<mode>.
	(altivec_vperm_<mode>_uns): Change to a define_expand, and call
	altivec_expand_vec_perm_le when producing code with little endian
	element order.
	(*altivec_vperm_<mode>_uns_internal): New insn having the previous
	behavior of altivec_vperm_<mode>_uns.

gcc/testsuite:

2014-01-31  Bill Schmidt

	* gcc.dg/vmx/3b-15.c: Remove special handling for little endian.
	* gcc.dg/vmx/perm.c: New.
	* gcc.dg/vmx/perm-be-order.c: New.
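For reference, here is a plain C model of what the little-endian
expansion does (illustrative only, not part of the patch): apply the
hardware vperm to the byte-reversed register images, with the two
source operands swapped and every selector byte replaced by 31 - SEL,
and the element-order semantics of vec_perm fall out.  The vperm and
to_reg_le helpers below are simplified stand-ins for the hardware
instruction and an LE vector load, not GCC interfaces.

#include <assert.h>
#include <stdio.h>

/* vperm as the ISA defines it: result byte i (big-endian register
   byte numbering) is byte (sel[i] & 31) of the 32-byte concatenation
   of va and vb.  */
static void
vperm (const unsigned char va[16], const unsigned char vb[16],
       const unsigned char sel[16], unsigned char vt[16])
{
  for (int i = 0; i < 16; i++)
    {
      int k = sel[i] & 31;
      vt[i] = k < 16 ? va[k] : vb[k - 16];
    }
}

/* Model of loading a char vector on LE: element i of the vector
   lands in big-endian-numbered register byte 15 - i.  */
static void
to_reg_le (const unsigned char elts[16], unsigned char reg[16])
{
  for (int i = 0; i < 16; i++)
    reg[15 - i] = elts[i];
}

int
main (void)
{
  unsigned char a[16], b[16], sel[16], want[16];
  unsigned char ra[16], rb[16], rsel[16], rres[16];

  for (int i = 0; i < 16; i++)
    {
      a[i] = i;                      /* elements 0..15  */
      b[i] = 16 + i;                 /* elements 16..31 */
      sel[i] = (i & 1) ? 16 + i : i; /* an arbitrary valid selector */
    }

  /* vec_perm's language-level semantics: element i of the result is
     element sel[i] of the 32-element concatenation of a and b.  */
  for (int i = 0; i < 16; i++)
    want[i] = sel[i] < 16 ? a[sel[i]] : b[sel[i] - 16];

  /* The little-endian expansion: operands reversed, SEL = 31 - SEL.  */
  to_reg_le (a, ra);
  to_reg_le (b, rb);
  for (int i = 0; i < 16; i++)
    sel[i] = 31 - sel[i];
  to_reg_le (sel, rsel);
  vperm (rb, ra, rsel, rres);

  /* On LE, element i of the result is register byte 15 - i.  */
  for (int i = 0; i < 16; i++)
    assert (rres[15 - i] == want[i]);

  puts ("LE expansion matches vec_perm element-order semantics");
  return 0;
}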
Index: gcc/testsuite/gcc.dg/vmx/perm-be-order.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/perm-be-order.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/perm-be-order.c	(revision 0)
@@ -0,0 +1,74 @@
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector signed char vsca = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vscp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vusp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vssp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vuip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vsip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vfp = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+#else
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vfp = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+#endif
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
Index: gcc/testsuite/gcc.dg/vmx/3b-15.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/3b-15.c	(revision 207373)
+++ gcc/testsuite/gcc.dg/vmx/3b-15.c	(working copy)
@@ -3,11 +3,7 @@
 vector unsigned char
 f (vector unsigned char a, vector unsigned char b, vector unsigned char c)
 {
-#ifdef __BIG_ENDIAN__
   return vec_perm(a,b,c);
-#else
-  return vec_perm(b,a,c);
-#endif
 }
 
 static void test()
@@ -16,13 +12,8 @@ static void test()
                                          8,9,10,11,12,13,14,15}),
                  ((vector unsigned char){70,71,72,73,74,75,76,77,
                                          78,79,80,81,82,83,84,85}),
-#ifdef __BIG_ENDIAN__
                  ((vector unsigned char){0x1,0x14,0x18,0x10,0x16,0x15,0x19,0x1a,
                                          0x1c,0x1c,0x1c,0x12,0x8,0x1d,0x1b,0xe})),
-#else
-                 ((vector unsigned char){0x1e,0xb,0x7,0xf,0x9,0xa,0x6,0x5,
-                                         0x3,0x3,0x3,0xd,0x17,0x2,0x4,0x11})),
-#endif
                 ((vector unsigned char){1,74,78,70,76,75,79,80,82,82,82,72,8,83,81,14})),
        "f");
 }
Index: gcc/testsuite/gcc.dg/vmx/perm.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/perm.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/perm.c	(revision 0)
@@ -0,0 +1,69 @@
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb
+    = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector signed char vsca
+    = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+  vector unsigned char vfp = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 207373)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -29840,16 +29840,18 @@ altivec_expand_vec_perm_le (rtx operands[4])
   rtx op1 = operands[2];
   rtx sel = operands[3];
   rtx tmp = target;
+  rtx splatreg = gen_reg_rtx (V16QImode);
+  enum machine_mode mode = GET_MODE (target);
 
   /* Get everything in regs so the pattern matches.  */
   if (!REG_P (op0))
-    op0 = force_reg (V16QImode, op0);
+    op0 = force_reg (mode, op0);
   if (!REG_P (op1))
-    op1 = force_reg (V16QImode, op1);
+    op1 = force_reg (mode, op1);
   if (!REG_P (sel))
     sel = force_reg (V16QImode, sel);
   if (!REG_P (target))
-    tmp = gen_reg_rtx (V16QImode);
+    tmp = gen_reg_rtx (mode);
 
   /* SEL = splat(31) - SEL.  */
   /* We want to subtract from 31, but we can't vspltisb 31 since
@@ -29857,13 +29859,12 @@ altivec_expand_vec_perm_le (rtx operands[4])
      five bits of the permute control vector elements are used.  */
   splat = gen_rtx_VEC_DUPLICATE (V16QImode,
                                  gen_rtx_CONST_INT (QImode, -1));
-  emit_move_insn (tmp, splat);
-  sel = gen_rtx_MINUS (V16QImode, tmp, sel);
-  emit_move_insn (tmp, sel);
+  emit_move_insn (splatreg, splat);
+  sel = gen_rtx_MINUS (V16QImode, splatreg, sel);
+  emit_move_insn (splatreg, sel);
 
   /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, op1, op0, tmp),
-                           UNSPEC_VPERM);
+  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, splatreg),
+                           UNSPEC_VPERM);
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 207373)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -1804,23 +1804,53 @@
   "vrfiz %0,%1"
   [(set_attr "type" "vecfloat")])
 
-(define_insn "altivec_vperm_<mode>"
+(define_expand "altivec_vperm_<mode>"
   [(set (match_operand:VM 0 "register_operand" "=v")
        (unspec:VM [(match_operand:VM 1 "register_operand" "v")
                    (match_operand:VM 2 "register_operand" "v")
                    (match_operand:V16QI 3 "register_operand" "v")]
                   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+       (unspec:VM [(match_operand:VM 1 "register_operand" "v")
+                   (match_operand:VM 2 "register_operand" "v")
+                   (match_operand:V16QI 3 "register_operand" "v")]
+                  UNSPEC_VPERM))]
+  "TARGET_ALTIVEC"
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vperm_<mode>_uns"
+(define_expand "altivec_vperm_<mode>_uns"
   [(set (match_operand:VM 0 "register_operand" "=v")
        (unspec:VM [(match_operand:VM 1 "register_operand" "v")
                    (match_operand:VM 2 "register_operand" "v")
                    (match_operand:V16QI 3 "register_operand" "v")]
                   UNSPEC_VPERM_UNS))]
   "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_uns_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+       (unspec:VM [(match_operand:VM 1 "register_operand" "v")
+                   (match_operand:VM 2 "register_operand" "v")
+                   (match_operand:V16QI 3 "register_operand" "v")]
+                  UNSPEC_VPERM_UNS))]
+  "TARGET_ALTIVEC"
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])
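As a closing user-level illustration (not part of the patch, and the
function name is invented): with this change a single vec_perm call
gives the same element-order result on big- and little-endian targets,
which is exactly why the __BIG_ENDIAN__ workaround in 3b-15.c can go.

#include <altivec.h>

/* Selector taken from the new perm.c test: the result is
   a[0],b[15],a[1],b[14],...  No endian #ifdef is needed around
   the call.  */
vector unsigned char
mix_halves (vector unsigned char a, vector unsigned char b)
{
  const vector unsigned char sel
    = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
  return vec_perm (a, b, sel);
}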