From patchwork Fri Aug 19 23:59:39 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 661055
Date: Fri, 19 Aug 2016 19:59:39 -0400
From: Michael Meissner
To: Michael Meissner, Segher Boessenkool, gcc-patches@gcc.gnu.org,
 David Edelsohn, Bill Schmidt
Subject: [PATCH], Patch #6, Improve vector short/char splat initialization on PowerPC
References: <20160804043344.GA8391@ibm-tiger.the-meissners.org>
 <20160804150336.GA22744@gate.crashing.org>
 <20160808225520.GA29239@ibm-tiger.the-meissners.org>
 <20160811231517.GA2148@ibm-tiger.the-meissners.org>
 <20160819221754.GA28759@ibm-tiger.the-meissners.org>
In-Reply-To: <20160819221754.GA28759@ibm-tiger.the-meissners.org>
Message-Id: <20160819235939.GA21010@ibm-tiger.the-meissners.org>

This patch is a follow-up to patch #5.  It adds support for using the
Altivec VSPLTB/VSPLTH instructions when creating a vector char or vector
short in which every element is the same (but not constant) on 64-bit
systems with direct move.

The patch has been part of the larger set of vector initialization patches
that I've been testing for a while.  Most of those patches were submitted in
patch #5 and in this patch (#6).  A few remaining patches cause a 4%
performance degradation in the zeusmp benchmark (everything else in the
larger set of patches performs about the same).  I built and ran zeusmp, and
these particular patches do not cause the degradation.  I will submit a full
run over the weekend just to be sure.

I tested these patches on a big-endian Power8 system and a little-endian
Power8 system, and previous versions have run on a big-endian Power7 system.
There were no regressions caused by these patches.  Can I install these
patches in the GCC 7 trunk after the patches in patch #5 are installed?
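For illustration only, the kind of source the description above refers to
might look like the sketch below.  It uses GCC's generic vector extension
instead of <altivec.h> so that it builds on any target; the type and function
names are made up for this example and are not taken from the patch or its
test suite.  Each initializer repeats one run-time value, which is the
"all the same element, but not constant" case.

```c
#include <assert.h>

/* 128-bit vectors of 8 shorts and 16 chars (GCC vector extension).  */
typedef short v8hi __attribute__ ((vector_size (16)));
typedef signed char v16qi __attribute__ ((vector_size (16)));

/* Every element is the same run-time value: a vector short splat.  */
v8hi
splat_short (short x)
{
  return (v8hi) { x, x, x, x, x, x, x, x };
}

/* The same for vector char.  */
v16qi
splat_char (signed char x)
{
  return (v16qi) { x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x };
}

/* Self-check helper: abort if any element of the splat differs from X.  */
void
check_splat_short (short x)
{
  v8hi v = splat_short (x);
  for (int i = 0; i < 8; i++)
    assert (v[i] == x);
}
```

On a 64-bit PowerPC target with direct move (presumably -mcpu=power8), one
would expect functions like these to use a splat instruction instead of the
existing store-to-stack, load and splat sequence, but that is an expectation
to check against the generated assembly, not a guarantee.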
[gcc]
2016-08-19  Michael Meissner

	* config/rs6000/rs6000.c (rs6000_expand_vector_init): Add support
	for using VSPLTH/VSPLTB to initialize vector short and vector char
	vectors with all of the same element.

	* config/rs6000/vsx.md (VSX_SPLAT_I): New mode iterators and
	attributes to initialize V8HImode and V16QImode vectors with the
	same element.
	(VSX_SPLAT_COUNT): Likewise.
	(VSX_SPLAT_SUFFIX): Likewise.
	(vsx_vsplt<VSX_SPLAT_SUFFIX>_di): New insns to support
	initializing V8HImode and V16QImode vectors with the same element.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 239627)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6827,6 +6827,32 @@ rs6000_expand_vector_init (rtx target, r
       return;
     }
 
+  /* Special case initializing vector short/char that are splats if we are on
+     64-bit systems with direct move.  */
+  if (all_same && TARGET_DIRECT_MOVE_64BIT
+      && (mode == V16QImode || mode == V8HImode))
+    {
+      rtx op0 = XVECEXP (vals, 0, 0);
+      rtx di_tmp = gen_reg_rtx (DImode);
+
+      if (!REG_P (op0))
+        op0 = force_reg (GET_MODE_INNER (mode), op0);
+
+      if (mode == V16QImode)
+        {
+          emit_insn (gen_zero_extendqidi2 (di_tmp, op0));
+          emit_insn (gen_vsx_vspltb_di (target, di_tmp));
+          return;
+        }
+
+      if (mode == V8HImode)
+        {
+          emit_insn (gen_zero_extendhidi2 (di_tmp, op0));
+          emit_insn (gen_vsx_vsplth_di (target, di_tmp));
+          return;
+        }
+    }
+
   /* Store value to stack temp.  Load vector element.  Splat.  However, splat
      of 64-bit items is not supported on Altivec.  */
   if (all_same && GET_MODE_SIZE (inner_mode) <= 4)
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 239588)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -281,6 +281,16 @@ (define_mode_attr VSX_EX [(V16QI "v")
                           (V8HI  "v")
                           (V4SI  "wa")])
 
+;; Iterator for the 2 short vector types to do a splat from an integer
+(define_mode_iterator VSX_SPLAT_I [V16QI V8HI])
+
+;; Mode attribute to give the count for the splat instruction to splat
+;; the value in the 64-bit integer slot
+(define_mode_attr VSX_SPLAT_COUNT [(V16QI "7") (V8HI "3")])
+
+;; Mode attribute to give the suffix for the splat instruction
+(define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -2766,6 +2776,16 @@ (define_insn "vsx_xxspltw_<mode>_direct"
   "xxspltw %x0,%x1,%2"
   [(set_attr "type" "vecperm")])
 
+;; V16QI/V8HI splat support on ISA 2.07
+(define_insn "vsx_vsplt<VSX_SPLAT_SUFFIX>_di"
+  [(set (match_operand:VSX_SPLAT_I 0 "altivec_register_operand" "=v")
+        (vec_duplicate:VSX_SPLAT_I
+         (truncate:<VS_scalar>
+          (match_operand:DI 1 "altivec_register_operand" "v"))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
+  "vsplt<VSX_SPLAT_SUFFIX> %0,%1,<VSX_SPLAT_COUNT>"
+  [(set_attr "type" "vecperm")])
+
 ;; V2DF/V2DI splat for use by vec_splat builtin
 (define_insn "vsx_xxspltd_<mode>"
   [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
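
The counts 7 and 3 in VSX_SPLAT_COUNT can be sanity-checked with a small
software model.  The sketch below assumes that the zero-extended 64-bit value
ends up in doubleword 0 of the vector register (the "64-bit integer slot")
and that elements are numbered from the leftmost, most significant end.  The
vreg type and the model_* functions are hypothetical names for this
illustration; they are not GCC or ISA interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of a 128-bit vector register; b[0] is the leftmost byte.  */
typedef struct { uint8_t b[16]; } vreg;

/* Assumed placement: the 64-bit integer occupies doubleword 0 (bytes 0..7),
   most significant byte first.  Doubleword 1 is zeroed here only to keep
   the model deterministic.  */
vreg
model_di_in_vreg (uint64_t x)
{
  vreg r;
  memset (r.b, 0, sizeof r.b);
  for (int i = 0; i < 8; i++)
    r.b[i] = (uint8_t) (x >> (56 - 8 * i));
  return r;
}

/* Model of a byte splat: copy byte UIM of SRC into all 16 bytes.  */
vreg
model_vspltb (vreg src, unsigned uim)
{
  vreg r;
  assert (uim < 16);
  memset (r.b, src.b[uim], sizeof r.b);
  return r;
}

/* Model of a halfword splat: copy halfword UIM of SRC into all 8
   halfwords.  */
vreg
model_vsplth (vreg src, unsigned uim)
{
  vreg r;
  assert (uim < 8);
  for (int i = 0; i < 8; i++)
    {
      r.b[2 * i] = src.b[2 * uim];
      r.b[2 * i + 1] = src.b[2 * uim + 1];
    }
  return r;
}
```

Under these assumptions the low-order byte of the 64-bit value is byte 7 and
the low-order halfword is halfword 3, so model_vspltb (v, 7) and
model_vsplth (v, 3) replicate the zero-extended char or short across the
whole register.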