From patchwork Wed Jun 1 23:27:22 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 628900
Date: Wed, 1 Jun 2016 19:27:22 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt, Kelvin Nilsen
Subject: [PATCH applied], Backport PowerPC ISA 3.0 xxperm, builtin, and vneg
 support to GCC 6.2
Message-ID: <20160601232722.GA9851@ibm-tiger.the-meissners.org>
Mail-Followup-To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt, Kelvin Nilsen
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-12-10)
X-TM-AS-GCONF: 00
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16060123-0057-0000-0000-0000047D1AAF
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
X-IsSubscribed: yes

I applied the following patches that were applied to the trunk to the GCC 6.2
branch:

[gcc]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-23  Michael Meissner

        PR target/71201
        * config/rs6000/altivec.md (altivec_vperm__internal): Drop ISA 3.0
        xxperm fusion alternative.
        (altivec_vperm_v8hiv16qi): Likewise.
        (altivec_vperm__uns_internal): Likewise.
        (vperm_v8hiv4si): Likewise.
        (vperm_v16qiv8hi): Likewise.

        Back port from trunk
        2016-05-23  Michael Meissner
                    Kelvin Nilsen

        * config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
        vpermr/xxpermr on ISA 3.0.
        (altivec_expand_vec_perm_le): Likewise.
        * config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
        (altivec_vpermr__internal): Add VPERMR/XXPERMR support for ISA
        3.0.

[gcc/testsuite]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-23  Michael Meissner
                    Kelvin Nilsen

        * gcc.target/powerpc/p9-permute.c: Run test on big endian as well
        as little endian.

        Back port from trunk
        2016-05-23  Michael Meissner
                    Kelvin Nilsen

        * gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
        support.

[gcc]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-24  Michael Meissner

        * config/rs6000/altivec.md (VParity): New mode iterator for vector
        parity built-in functions.
        (p9v_ctz2): Add support for ISA 3.0 vector count trailing zeros.
        (p9v_parity2): Likewise.
        * config/rs6000/vector.md (VEC_IP): New mode iterator for vector
        parity.
        (ctz2): ISA 3.0 expander for vector count trailing zeros.
        (parity2): ISA 3.0 expander for vector parity.
        * config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
        power9 built-ins.
        (BU_P9_64BIT_MISC_0): Likewise.
        (BU_P9_MISC_0): Likewise.
        (BU_P9V_AV_1): Likewise.
        (BU_P9V_AV_2): Likewise.
        (BU_P9V_AV_3): Likewise.
        (BU_P9V_AV_P): Likewise.
        (BU_P9V_VSX_1): Likewise.
        (BU_P9V_OVERLOAD_1): Likewise.
        (BU_P9V_OVERLOAD_2): Likewise.
        (BU_P9V_OVERLOAD_3): Likewise.
        (VCTZB): Add vector count trailing zeros support.
        (VCTZH): Likewise.
        (VCTZW): Likewise.
        (VCTZD): Likewise.
        (VPRTYBD): Add vector parity support.
        (VPRTYBQ): Likewise.
        (VPRTYBW): Likewise.
        (VCTZ): Add overloaded vector count trailing zeros support.
        (VPRTYB): Add overloaded vector parity support.
        * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
        overloaded vector count trailing zeros and parity instructions.
        * config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
        vector parity support.
        * config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
        trailing zeros support.
        (vec_cntlz): Likewise.
        (vec_vctzb): Likewise.
        (vec_vctzd): Likewise.
        (vec_vctzh): Likewise.
        (vec_vctzw): Likewise.
        (vec_vprtyb): Add ISA 3.0 vector parity support.
        (vec_vprtybd): Likewise.
        (vec_vprtybw): Likewise.
        (vec_vprtybq): Likewise.
        * doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
        the ISA 3.0 vector count trailing zeros and vector parity built-in
        functions.

[gcc/testsuite]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-24  Michael Meissner

        * gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
        vector parity built-in functions.
        * gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
        count trailing zeros automatic vectorization.
        * gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
        count trailing zeros built-in functions.

[gcc]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-24  Michael Meissner

        * config/rs6000/altivec.md (VNEG iterator): New iterator for
        VNEGW/VNEGD instructions.
        (p9_neg2): New insns for ISA 3.0 VNEGW/VNEGD.
        (neg2): Add expander for V2DImode added in ISA 2.07, and support
        for ISA 3.0 VNEGW/VNEGD instructions.

[gcc/testsuite]
2016-06-01  Michael Meissner

        Back port from trunk
        2016-05-24  Michael Meissner

        * gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
        instructions.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 236958)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -6853,21 +6853,29 @@ rs6000_expand_vector_set (rtx target, rt
                               gen_rtvec (3, target, reg,
                                          force_reg (V16QImode, x)),
                               UNSPEC_VPERM);
-      else
+      else
         {
-          /* Invert selector.  We prefer to generate VNAND on P8 so
-             that future fusion opportunities can kick in, but must
-             generate VNOR elsewhere.  */
-          rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-          rtx iorx = (TARGET_P8_VECTOR
-                      ? gen_rtx_IOR (V16QImode, notx, notx)
-                      : gen_rtx_AND (V16QImode, notx, notx));
-          rtx tmp = gen_reg_rtx (V16QImode);
-          emit_insn (gen_rtx_SET (tmp, iorx));
-
-          /* Permute with operands reversed and adjusted selector.  */
-          x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
-                              UNSPEC_VPERM);
+          if (TARGET_P9_VECTOR)
+            x = gen_rtx_UNSPEC (mode,
+                                gen_rtvec (3, target, reg,
+                                           force_reg (V16QImode, x)),
+                                UNSPEC_VPERMR);
+          else
+            {
+              /* Invert selector.  We prefer to generate VNAND on P8 so
+                 that future fusion opportunities can kick in, but must
+                 generate VNOR elsewhere.  */
+              rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+              rtx iorx = (TARGET_P8_VECTOR
+                          ? gen_rtx_IOR (V16QImode, notx, notx)
+                          : gen_rtx_AND (V16QImode, notx, notx));
+              rtx tmp = gen_reg_rtx (V16QImode);
+              emit_insn (gen_rtx_SET (tmp, iorx));
+
+              /* Permute with operands reversed and adjusted selector.  */
+              x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
+                                  UNSPEC_VPERM);
+            }
         }

       emit_insn (gen_rtx_SET (target, x));
@@ -34128,17 +34136,25 @@ altivec_expand_vec_perm_le (rtx operands
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);

-  /* Invert the selector with a VNAND if available, else a VNOR.
-     The VNAND is preferred for future fusion opportunities.  */
-  notx = gen_rtx_NOT (V16QImode, sel);
-  iorx = (TARGET_P8_VECTOR
-          ? gen_rtx_IOR (V16QImode, notx, notx)
-          : gen_rtx_AND (V16QImode, notx, notx));
-  emit_insn (gen_rtx_SET (norreg, iorx));
+  if (TARGET_P9_VECTOR)
+    {
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
+                               UNSPEC_VPERMR);
+    }
+  else
+    {
+      /* Invert the selector with a VNAND if available, else a VNOR.
+         The VNAND is preferred for future fusion opportunities.  */
+      notx = gen_rtx_NOT (V16QImode, sel);
+      iorx = (TARGET_P8_VECTOR
+              ? gen_rtx_IOR (V16QImode, notx, notx)
+              : gen_rtx_AND (V16QImode, notx, notx));
+      emit_insn (gen_rtx_SET (norreg, iorx));

-  /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
-                           UNSPEC_VPERM);
+      /* Permute with operands reversed and adjusted selector.  */
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+                               UNSPEC_VPERM);
+    }

   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md (revision 236941)
+++ gcc/config/rs6000/altivec.md (working copy)
@@ -58,6 +58,7 @@ (define_c_enum "unspec"
    UNSPEC_VSUM2SWS
    UNSPEC_VSUMSWS
    UNSPEC_VPERM
+   UNSPEC_VPERMR
    UNSPEC_VPERM_UNS
    UNSPEC_VRFIN
    UNSPEC_VCFUX
@@ -1949,32 +1950,30 @@ (define_expand "altivec_vperm_"

 ;; Slightly prefer vperm, since the target does not overlap the source
 (define_insn "*altivec_vperm__internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-                    (match_operand:VM 2 "register_operand" "v,wo,wo")
-                    (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                    (match_operand:VM 2 "register_operand" "v,wo")
+                    (match_operand:V16QI 3 "register_operand" "v,wo")]
                    UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])

 (define_insn "altivec_vperm_v8hiv16qi"
-  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0,wo")
-                       (match_operand:V8HI 2 "register_operand" "v,wo,wo")
-                       (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo")
+        (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0")
+                       (match_operand:V8HI 2 "register_operand" "v,wo")
+                       (match_operand:V16QI 3 "register_operand" "v,wo")]
                    UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])

 (define_expand "altivec_vperm__uns"
   [(set (match_operand:VM 0 "register_operand" "")
@@ -1992,18 +1991,17 @@ (define_expand "altivec_vperm__uns
 })

 (define_insn "*altivec_vperm__uns_internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-                    (match_operand:VM 2 "register_operand" "v,wo,wo")
-                    (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                    (match_operand:VM 2 "register_operand" "v,wo")
+                    (match_operand:V16QI 3 "register_operand" "v,wo")]
                    UNSPEC_VPERM_UNS))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])

 (define_expand "vec_permv16qi"
   [(set (match_operand:V16QI 0 "register_operand" "")
@@ -2032,6 +2030,19 @@ (define_expand "vec_perm_constv16qi"
     FAIL;
 })

+(define_insn "*altivec_vpermr__internal"
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+        (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                    (match_operand:VM 2 "register_operand" "v,wo")
+                    (match_operand:V16QI 3 "register_operand" "v,wo")]
+                   UNSPEC_VPERMR))]
+  "TARGET_P9_VECTOR"
+  "@
+   vpermr %0,%1,%2,%3
+   xxpermr %x0,%x2,%x3"
+  [(set_attr "type" "vecperm")
+   (set_attr "length" "4")])
+
 (define_insn "altivec_vrfip" ; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]
@@ -2791,32 +2802,30 @@ (define_expand "vec_unpacks_lo_
+
+vector long long
+permute (vector long long *p, vector long long *q, vector unsigned char mask)
+{
+  vector long long a = *p;
+  vector long long b = *q;
+
+  /* Force a, b to be in altivec registers to select vpermr insn.  */
+  __asm__ (" # a: %x0, b: %x1" : "+v" (a), "+v" (b));
+
+  return vec_perm (a, b, mask);
+}
+
+/* { dg-final { scan-assembler "vpermr\|xxpermr" } } */
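
For readers following the rs6000.c hunks: the non-P9 path in
altivec_expand_vec_perm_le inverts the selector and swaps the two inputs
before issuing vperm. The following is a minimal scalar sketch in C of why
that sequence gives little-endian element semantics. It assumes vperm can be
modeled as a lookup into the 32-byte concatenation of its two inputs, indexed
by the low five bits of each selector byte in big-endian (register order)
byte numbering. The names vperm_model, perm_le_reference,
check_inverted_selector and run_selector_checks are hypothetical helpers for
illustration only; they are not part of this patch or of GCC, and the sketch
does not model the operand order of vpermr/xxpermr.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vperm: byte k of the result is byte
   (sel[k] & 31) of the 32-byte concatenation x || y, with bytes
   numbered in big-endian (register) order.  */
static void
vperm_model (uint8_t out[16], const uint8_t x[16], const uint8_t y[16],
             const uint8_t sel[16])
{
  uint8_t cat[32];
  memcpy (cat, x, 16);
  memcpy (cat + 16, y, 16);
  for (int k = 0; k < 16; k++)
    out[k] = cat[sel[k] & 31];
}

/* Reference permute with little-endian element numbering.  The arrays
   are still stored in register order, so LE element i is byte 15 - i.  */
static void
perm_le_reference (uint8_t out[16], const uint8_t a[16],
                   const uint8_t b[16], const uint8_t sel[16])
{
  for (int i = 0; i < 16; i++)
    {
      unsigned m = sel[15 - i] & 31;   /* LE selector element i.  */
      out[15 - i] = (m < 16) ? a[15 - m] : b[15 - (m - 16)];
    }
}

/* Return 1 if "invert selector, swap inputs, vperm" matches the
   little-endian reference for these inputs.  */
static int
check_inverted_selector (const uint8_t a[16], const uint8_t b[16],
                         const uint8_t sel[16])
{
  uint8_t inv[16], got[16], want[16];
  for (int k = 0; k < 16; k++)
    inv[k] = (uint8_t) ~sel[k];        /* What VNAND/VNOR of x with x gives.  */
  vperm_model (got, b, a, inv);        /* Operands reversed.  */
  perm_le_reference (want, a, b, sel);
  return memcmp (got, want, 16) == 0;
}

/* Exercise 256 arbitrary selector patterns; asserts on any mismatch.  */
static void
run_selector_checks (void)
{
  uint8_t a[16], b[16], sel[16];
  for (int k = 0; k < 16; k++)
    {
      a[k] = (uint8_t) (0x10 + k);
      b[k] = (uint8_t) (0x80 + k);
    }
  for (unsigned seed = 0; seed < 256; seed++)
    {
      for (int k = 0; k < 16; k++)
        sel[k] = (uint8_t) (seed * 7 + k * 13);
      assert (check_inverted_selector (a, b, sel));
    }
}
```

Calling run_selector_checks () from a test driver walks the selector patterns
and aborts on the first mismatch. The identity it checks is that inverting a
five-bit index m gives 31 - m, which, combined with swapping the inputs, maps
each register-order byte onto the little-endian element the source asked for.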