From patchwork Tue Feb 23 21:42:19 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Evandro Menezes
X-Patchwork-Id: 587108
To: GCC Patches
From: Evandro Menezes
Subject: [COMMITTED][AArch64] Tweak the pipeline model for Exynos M1
Message-id: <56CCD23B.5020108@samsung.com>
Date: Tue, 23 Feb 2016 15:42:19 -0600

Minor tweaks to the cost and scheduling models for Exynos M1.

Committed as r233646 and r233647.

From 01cadc5b883a2613f847aa7a88b86aed454d9413 Mon Sep 17 00:00:00 2001
From: evandro
Date: Tue, 23 Feb 2016 21:31:00 +0000
Subject: [PATCH 2/2] Tweak the pipeline model for Exynos M1

   gcc/
	* config/aarch64/aarch64.c (exynosm1_tunings): Enable fusion of AES{D,E}
	and AESMC pairs.
	* config/arm/exynos-m1.md: Change cost of STP, fix bypass for stores
	and add bypass for AES{D,E} and AESMC pairs.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@233647 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog                |  7 +++++++
 gcc/config/aarch64/aarch64.c |  2 +-
 gcc/config/arm/exynos-m1.md  | 26 +++++++++++++++++---------
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 22dd022..07b50b5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2016-02-23  Evandro Menezes
 
+	* config/arm/exynos-m1.md: Change cost of STP, fix bypass for stores
+	and add bypass for AES{D,E} and AESMC pairs.
+	* config/aarch64/aarch64.c (exynosm1_tunings): Enable fusion of AES{D,E}
+	and AESMC pairs.
+
+2016-02-23  Evandro Menezes
+
 	* config/aarch64/aarch64.c (exynosm1_tunings): Enable the Newton series
 	for reciprocal square root in Exynos M1.
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index dc3dfea..6dc8330 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -526,7 +526,7 @@ static const struct tune_params exynosm1_tunings =
   &generic_branch_cost,
   4, /* memmov_cost */
   3, /* issue_rate */
-  (AARCH64_FUSE_NOTHING), /* fusible_ops */
+  (AARCH64_FUSE_AES_AESMC), /* fusible_ops */
   4, /* function_align. */
   4, /* jump_align. */
   4, /* loop_align. */
diff --git a/gcc/config/arm/exynos-m1.md b/gcc/config/arm/exynos-m1.md
index 2f52b22..318b151 100644
--- a/gcc/config/arm/exynos-m1.md
+++ b/gcc/config/arm/exynos-m1.md
@@ -248,10 +248,6 @@
            (eq_attr "type" "neon_load4_all_lanes, neon_load4_all_lanes_q")
              (const_string "neon_load4_all")
 
-           (eq_attr "type" "f_stores, f_stored,\
-                            neon_stp, neon_stp_q")
-             (const_string "neon_store")
-
            (eq_attr "type" "neon_store1_1reg, neon_store1_1reg_q")
              (const_string "neon_store1_1")
 
@@ -730,8 +726,14 @@
 (define_insn_reservation
   "exynos_m1_neon_store" 1
   (and (eq_attr "tune" "exynosm1")
-       (eq_attr "exynos_m1_neon_type" "neon_store"))
-  "(em1_fst, em1_st)")
+       (eq_attr "type" "f_stores, f_stored, neon_stp"))
+  "em1_sfst")
+
+(define_insn_reservation
+  "exynos_m1_neon_store_q" 3
+  (and (eq_attr "tune" "exynosm1")
+       (eq_attr "type" "neon_stp_q"))
+  "(em1_sfst * 2)")
 
 (define_insn_reservation
   "exynos_m1_neon_store1_1" 1
@@ -761,7 +763,7 @@
   "exynos_m1_neon_store1_one" 7
   (and (eq_attr "tune" "exynosm1")
        (eq_attr "exynos_m1_neon_type" "neon_store1_one"))
-  "(em1_fst, em1_st)")
+  "em1_sfst")
 
 (define_insn_reservation
   "exynos_m1_neon_store2" 7
@@ -892,7 +894,9 @@
 
 ;; Pre-decrement and post-increment addressing modes update the register quickly.
 ;; TODO: figure out how to tell the addressing mode register from the loaded one.
-(define_bypass 1 "exynos_m1_store*" "exynos_m1_store*")
+(define_bypass 1 "exynos_m1_store*, exynos_m1_neon_store*"
+                 "exynos_m1_store*, exynos_m1_neon_store*,
+                  exynos_m1_load*, exynos_m1_neon_load*")
 
 ;; MLAs can feed other MLAs quickly.
 (define_bypass 1 "exynos_m1_mla*" "exynos_m1_mla*")
@@ -908,7 +912,6 @@
 (define_bypass 5 "exynos_m1_neon_fp_mla, exynos_m1_neon_fp_step"
                  "exynos_m1_neon_fp_add, exynos_m1_neon_fp_mul,\
                   exynos_m1_neon_fp_mla, exynos_m1_neon_fp_step")
-
 (define_bypass 3 "exynos_m1_fp_add"
                  "exynos_m1_fp_add, exynos_m1_fp_mul, exynos_m1_fp_mac")
 (define_bypass 3 "exynos_m1_neon_fp_add"
@@ -947,6 +950,11 @@
                  "exynos_m1_crypto_simple, exynos_m1_crypto_complex,\
                   exynos_m1_crypto_poly*")
 
+;; AES{D,E}/AESMC pairs can feed each other instantly.
+(define_bypass 0 "exynos_m1_crypto_simple"
+                 "exynos_m1_crypto_simple"
+                 "aarch_crypto_can_dual_issue")
+
 ;; Predicted branches take no time, but mispredicted ones take forever anyway.
 (define_bypass 1 "exynos_m1_*"
                  "exynos_m1_call, exynos_m1_branch")
-- 
1.9.1