From patchwork Thu Nov 13 17:25:45 2014
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 410511
Message-ID: <5464E999.60203@arm.com>
Date: Thu, 13 Nov 2014 17:25:45 +0000
From: Kyrill Tkachov
To: GCC Patches, Ramana Radhakrishnan, Richard Earnshaw
Subject: [PATCH][ARM] Add Cortex-A17 support

Hi all,

This patch adds support for the Cortex-A17 processor to the arm backend.
Cortex-A17 is an ARMv7ve core with the same architectural features as the
Cortex-A7, A12 and A15 cores.

The -m{tune,cpu}=cortex-a17 option is added and a pipeline description for
instruction scheduling is provided. This has given an uplift over
-mcpu=cortex-a15 tuning on a number of benchmarks.

The patch is fairly self-contained, with the bulk of the diffstat being the
pipeline description files.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-13  Kyrylo Tkachov

    * config/arm/arm.md (generic_sched): Specify cortexa17 in 'no' list.
    Include cortex-a17.md.
    * config/arm/arm.c (arm_issue_rate): Specify 2 for cortexa17.
    * config/arm/arm-cores.def (cortex-a17): New entry.
    * config/arm/arm-tables.opt: Regenerate.
    * config/arm/arm-tune.md: Regenerate.
    * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=cortex-a17.
    * config/arm/cortex-a17.md: New file.
    * config/arm/cortex-a17-neon.md: New file.
    * config/arm/driver-arm.c (arm_cpu_table): Add entry for cortex-a17.
    * config/arm/t-aprofile: Add cortex-a17 entries to MULTILIB_MATCHES.
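
For readers unfamiliar with the issue-rate hook, the arm_issue_rate entry in the ChangeLog can be pictured roughly as follows. This is only a minimal standalone sketch, not the code in arm.c: the enum, the function name issue_rate_for and the default of 1 are assumptions made for illustration. The core names and the value 2 are taken from the patch.

```c
/* Simplified, standalone sketch of an issue-rate lookup.
   Hypothetical names; the real hook is arm_issue_rate in arm.c.  */

enum processor_type
{
  cortexa8,
  cortexa9,
  cortexa12,
  cortexa17,
  cortexa53,
  fa726te,
  marvell_pj4,
  some_other_core   /* Placeholder for every core not listed above.  */
};

/* Return how many instructions the scheduler may issue per cycle
   when tuning for TUNE.  */
int
issue_rate_for (enum processor_type tune)
{
  switch (tune)
    {
    /* The patch adds cortexa17 to the group of dual-issue cores.  */
    case cortexa8:
    case cortexa9:
    case cortexa12:
    case cortexa17:
    case cortexa53:
    case fa726te:
    case marvell_pj4:
      return 2;

    /* Assumption for this sketch: anything else is single-issue.  */
    default:
      return 1;
    }
}
```

With this sketch, issue_rate_for (cortexa17) yields 2, the same value as for cortexa12, which matches the ChangeLog entry.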
commit 6f02016328e127c1c2c7491d7ad160c575f6d9ae
Author: Kyrylo Tkachov
Date:   Tue Oct 21 12:29:07 2014 +0100

    [ARM] Cortex-A17 tuning

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d5067b0..f8003ce 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -150,6 +150,7 @@ ARM_CORE("cortex-a8", cortexa8, cortexa8, 7A, FL_LDSCHED, cortex_a8)
 ARM_CORE("cortex-a9", cortexa9, cortexa9, 7A, FL_LDSCHED, cortex_a9)
 ARM_CORE("cortex-a12", cortexa12, cortexa15, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a12)
 ARM_CORE("cortex-a15", cortexa15, cortexa15, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
+ARM_CORE("cortex-a17", cortexa17, cortexa17, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a12)
 ARM_CORE("cortex-r4", cortexr4, cortexr4, 7R, FL_LDSCHED, cortex)
 ARM_CORE("cortex-r4f", cortexr4f, cortexr4f, 7R, FL_LDSCHED, cortex)
 ARM_CORE("cortex-r5", cortexr5, cortexr5, 7R, FL_LDSCHED | FL_ARM_DIV, cortex)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9c7e944..9d8159f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -271,6 +271,9 @@ EnumValue
 Enum(processor_type) String(cortex-a15) Value(cortexa15)
 
 EnumValue
+Enum(processor_type) String(cortex-a17) Value(cortexa17)
+
+EnumValue
 Enum(processor_type) String(cortex-r4) Value(cortexr4)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 84355d6..7218542 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,7 +28,7 @@ (define_attr "tune"
         cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
         genericv7a,cortexa5,cortexa7,
         cortexa8,cortexa9,cortexa12,
-        cortexa15,cortexr4,cortexr4f,
+        cortexa15,cortexa17,cortexr4,cortexr4f,
         cortexr5,cortexr7,cortexm7,
         cortexm4,cortexm3,marvell_pj4,
         cortexa15cortexa7,cortexa53,cortexa57,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d2df286..6feb4b8 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29965,6 +29965,7 @@ arm_issue_rate (void)
     case cortexa8:
     case cortexa9:
     case cortexa12:
+    case cortexa17:
     case cortexa53:
     case fa726te:
     case marvell_pj4:
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3c2798e..52199a7 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -377,7 +377,7 @@ (define_attr "tune_cortexr4" "yes,no"
 
 (define_attr "generic_sched" "yes,no"
   (const (if_then_else
-          (ior (eq_attr "tune" "fa526,fa626,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1020e,arm1026ejs,arm1136js,arm1136jfs,cortexa5,cortexa7,cortexa8,cortexa9,cortexa12,cortexa15,cortexa53,cortexm4,marvell_pj4")
+          (ior (eq_attr "tune" "fa526,fa626,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1020e,arm1026ejs,arm1136js,arm1136jfs,cortexa5,cortexa7,cortexa8,cortexa9,cortexa12,cortexa15,cortexa17,cortexa53,cortexm4,marvell_pj4")
                (eq_attr "tune_cortexr4" "yes"))
           (const_string "no")
           (const_string "yes"))))
@@ -406,6 +406,7 @@ (define_attr "generic_vfp" "yes,no"
 (include "cortex-a8.md")
 (include "cortex-a9.md")
 (include "cortex-a15.md")
+(include "cortex-a17.md")
 (include "cortex-a53.md")
 (include "cortex-r4.md")
 (include "cortex-r4f.md")
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index f99e1af..22a37ae 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -64,7 +64,7 @@
   " %{!mlittle-endian:%{march=armv7-a|mcpu=cortex-a5 \
    |mcpu=cortex-a7 \
    |mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15 \
-   |mcpu=cortex-a12 \
+   |mcpu=cortex-a12|mcpu=cortex-a17 \
    |mcpu=cortex-a15.cortex-a7 \
    |mcpu=marvell-pj4 \
    |mcpu=cortex-a53 \
@@ -85,7 +85,7 @@
   " %{mbig-endian:%{march=armv7-a|mcpu=cortex-a5 \
    |mcpu=cortex-a7 \
    |mcpu=cortex-a8|mcpu=cortex-a9|mcpu=cortex-a15 \
-   |mcpu=cortex-a12 \
+   |mcpu=cortex-a12|mcpu=cortex-a17 \
    |mcpu=cortex-a15.cortex-a7 \
    |mcpu=cortex-a53 \
    |mcpu=cortex-a57 \
diff --git a/gcc/config/arm/cortex-a17-neon.md b/gcc/config/arm/cortex-a17-neon.md
new file mode 100644
index 0000000..95bc372
--- /dev/null
+++ b/gcc/config/arm/cortex-a17-neon.md
@@ -0,0 +1,605 @@
+;; ARM Cortex-A17 NEON pipeline description
+;; Copyright (C) 2014 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; .
+
+(define_attr "cortex_a17_neon_type"
+  "neon_abd, neon_abd_q, neon_arith_acc, neon_arith_acc_q,
+   neon_arith_basic, neon_arith_complex,
+   neon_reduc_add_acc, neon_multiply, neon_multiply_q,
+   neon_multiply_long, neon_mla, neon_mla_q, neon_mla_long,
+   neon_sat_mla_long, neon_shift_acc, neon_shift_imm_basic,\
+   neon_shift_imm_complex,
+   neon_shift_reg_basic, neon_shift_reg_basic_q, neon_shift_reg_complex,
+   neon_shift_reg_complex_q, neon_fp_negabs, neon_fp_arith,
+   neon_fp_arith_q, neon_fp_cvt_int,
+   neon_fp_cvt_int_q, neon_fp_cvt16, neon_fp_minmax, neon_fp_mul,
+   neon_fp_mul_q, neon_fp_mla, neon_fp_mla_q, neon_fp_recpe_rsqrte,
+   neon_fp_recpe_rsqrte_q, neon_bitops, neon_bitops_q, neon_from_gp,
+   neon_from_gp_q, neon_move, neon_tbl3_tbl4, neon_zip_q, neon_to_gp,
+   neon_load_a, neon_load_b, neon_load_c, neon_load_d, neon_load_e,
+   neon_load_f, neon_load_g, neon_load_h, neon_store_a, neon_store_b,
+   unknown"
+  (cond [
+          (eq_attr "type" "neon_abd, neon_abd_long")
+            (const_string "neon_abd")
+          (eq_attr "type" "neon_abd_q")
+            (const_string "neon_abd_q")
+          (eq_attr "type" "neon_arith_acc, neon_reduc_add_acc,\
+                           neon_reduc_add_acc_q")
+            (const_string "neon_arith_acc")
+          (eq_attr "type" "neon_arith_acc_q")
+            (const_string "neon_arith_acc_q")
+          (eq_attr "type" "neon_add, neon_add_q, neon_add_long,\
+                           neon_add_widen, neon_neg, neon_neg_q,\
+                           neon_reduc_add, neon_reduc_add_q,\
+                           neon_reduc_add_long, neon_sub, neon_sub_q,\
+                           neon_sub_long, neon_sub_widen, neon_logic,\
+                           neon_logic_q, neon_tst, neon_tst_q")
+            (const_string "neon_arith_basic")
+          (eq_attr "type" "neon_abs, neon_abs_q, neon_add_halve_narrow_q,\
+                           neon_add_halve, neon_add_halve_q,\
+                           neon_sub_halve, neon_sub_halve_q, neon_qabs,\
+                           neon_qabs_q, neon_qadd, neon_qadd_q, neon_qneg,\
+                           neon_qneg_q, neon_qsub, neon_qsub_q,\
+                           neon_sub_halve_narrow_q,\
+                           neon_compare, neon_compare_q,\
+                           neon_compare_zero, neon_compare_zero_q,\
+                           neon_minmax, neon_minmax_q, neon_reduc_minmax,\
+                           neon_reduc_minmax_q")
+            (const_string "neon_arith_complex")
+
+          (eq_attr "type" "neon_mul_b, neon_mul_h, neon_mul_s,\
+                           neon_mul_h_scalar, neon_mul_s_scalar,\
+                           neon_sat_mul_b, neon_sat_mul_h,\
+                           neon_sat_mul_s, neon_sat_mul_h_scalar,\
+                           neon_sat_mul_s_scalar,\
+                           neon_mul_b_long, neon_mul_h_long,\
+                           neon_mul_s_long,\
+                           neon_mul_h_scalar_long, neon_mul_s_scalar_long,\
+                           neon_sat_mul_b_long, neon_sat_mul_h_long,\
+                           neon_sat_mul_s_long, neon_sat_mul_h_scalar_long,\
+                           neon_sat_mul_s_scalar_long")
+            (const_string "neon_multiply")
+          (eq_attr "type" "neon_mul_b_q, neon_mul_h_q, neon_mul_s_q,\
+                           neon_mul_h_scalar_q, neon_mul_s_scalar_q,\
+                           neon_sat_mul_b_q, neon_sat_mul_h_q,\
+                           neon_sat_mul_s_q, neon_sat_mul_h_scalar_q,\
+                           neon_sat_mul_s_scalar_q")
+            (const_string "neon_multiply_q")
+          (eq_attr "type" "neon_mla_b, neon_mla_h, neon_mla_s,\
+                           neon_mla_h_scalar, neon_mla_s_scalar,\
+                           neon_mla_b_long, neon_mla_h_long,\
+                           neon_mla_s_long,\
+                           neon_mla_h_scalar_long, neon_mla_s_scalar_long")
+            (const_string "neon_mla")
+          (eq_attr "type" "neon_mla_b_q, neon_mla_h_q, neon_mla_s_q,\
+                           neon_mla_h_scalar_q, neon_mla_s_scalar_q")
+            (const_string "neon_mla_q")
+          (eq_attr "type" "neon_sat_mla_b_long, neon_sat_mla_h_long,\
+                           neon_sat_mla_s_long, neon_sat_mla_h_scalar_long,\
+                           neon_sat_mla_s_scalar_long")
+            (const_string "neon_sat_mla_long")
+
+          (eq_attr "type" "neon_shift_acc, neon_shift_acc_q")
+            (const_string "neon_shift_acc")
+          (eq_attr "type" "neon_shift_imm, neon_shift_imm_q,\
+                           neon_shift_imm_narrow_q, neon_shift_imm_long")
+            (const_string "neon_shift_imm_basic")
+          (eq_attr "type" "neon_sat_shift_imm, neon_sat_shift_imm_q,\
+                           neon_sat_shift_imm_narrow_q")
+            (const_string "neon_shift_imm_complex")
+          (eq_attr "type" "neon_shift_reg")
+            (const_string "neon_shift_reg_basic")
+          (eq_attr "type" "neon_shift_reg_q")
+            (const_string "neon_shift_reg_basic_q")
+          (eq_attr "type" "neon_sat_shift_reg")
+            (const_string "neon_shift_reg_complex")
+          (eq_attr "type" "neon_sat_shift_reg_q")
+            (const_string "neon_shift_reg_complex_q")
+
+          (eq_attr "type" "neon_fp_neg_s, neon_fp_neg_s_q,\
+                           neon_fp_abs_s, neon_fp_abs_s_q")
+            (const_string "neon_fp_negabs")
+          (eq_attr "type" "neon_fp_addsub_s, neon_fp_abd_s,\
+                           neon_fp_reduc_add_s, neon_fp_compare_s,\
+                           neon_fp_minmax_s, neon_fp_minmax_s_q,\
+                           neon_fp_reduc_minmax_s, neon_fp_round_s,\
+                           neon_fp_round_s_q, neon_fp_round_d,\
+                           neon_fp_round_d_q, neon_fp_reduc_minmax_s_q")
+            (const_string "neon_fp_arith")
+          (eq_attr "type" "neon_fp_addsub_s_q, neon_fp_abd_s_q,\
+                           neon_fp_reduc_add_s_q, neon_fp_compare_s_q")
+            (const_string "neon_fp_arith_q")
+          (eq_attr "type" "neon_fp_to_int_s, neon_int_to_fp_s")
+            (const_string "neon_fp_cvt_int")
+          (eq_attr "type" "neon_fp_to_int_s_q, neon_int_to_fp_s_q")
+            (const_string "neon_fp_cvt_int_q")
+          (eq_attr "type" "neon_fp_cvt_narrow_s_q, neon_fp_cvt_widen_h")
+            (const_string "neon_fp_cvt16")
+          (eq_attr "type" "neon_fp_mul_s, neon_fp_mul_s_scalar")
+            (const_string "neon_fp_mul")
+          (eq_attr "type" "neon_fp_mul_s_q, neon_fp_mul_s_scalar_q")
+            (const_string "neon_fp_mul_q")
+          (eq_attr "type" "neon_fp_mla_s, neon_fp_mla_s_scalar")
+            (const_string "neon_fp_mla")
+          (eq_attr "type" "neon_fp_mla_s_q, neon_fp_mla_s_scalar_q")
+            (const_string "neon_fp_mla_q")
+          (eq_attr "type" "neon_fp_recpe_s, neon_fp_rsqrte_s")
+            (const_string "neon_fp_recpe_rsqrte")
+          (eq_attr "type" "neon_fp_recpe_s_q, neon_fp_rsqrte_s_q")
+            (const_string "neon_fp_recpe_rsqrte_q")
+
+          (eq_attr "type" "neon_bsl, neon_cls, neon_cnt,\
+                           neon_rev, neon_permute,\
+                           neon_tbl1, neon_tbl2, neon_zip,\
+                           neon_dup, neon_dup_q, neon_ext, neon_ext_q,\
+                           neon_move, neon_move_q, neon_move_narrow_q")
+            (const_string "neon_bitops")
+          (eq_attr "type" "neon_bsl_q, neon_cls_q, neon_cnt_q,\
+                           neon_rev_q, neon_permute_q")
+            (const_string "neon_bitops_q")
+          (eq_attr "type" "neon_from_gp")
+            (const_string "neon_from_gp")
+          (eq_attr "type" "neon_from_gp_q")
+            (const_string "neon_from_gp_q")
+          (eq_attr "type" "neon_tbl3, neon_tbl4")
+            (const_string "neon_tbl3_tbl4")
+          (eq_attr "type" "neon_zip_q")
+            (const_string "neon_zip_q")
+          (eq_attr "type" "neon_to_gp, neon_to_gp_q")
+            (const_string "neon_to_gp")
+
+          (eq_attr "type" "neon_load1_1reg, neon_load1_1reg_q,\
+                           neon_load1_one_lane, neon_load1_one_lane_q")
+            (const_string "neon_load_a")
+
+          (eq_attr "type" "neon_load1_2reg, neon_load1_2reg_q")
+            (const_string "neon_load_b")
+
+          (eq_attr "type" "neon_load1_3reg, neon_load1_3reg_q,\
+                           neon_load1_all_lanes,neon_load1_all_lanes_q,\
+                           neon_load2_one_lane, neon_load2_one_lane_q,\
+                           neon_load2_all_lanes, neon_load2_all_lanes_q")
+            (const_string "neon_load_c")
+
+          (eq_attr "type" "neon_load1_4reg, neon_load1_4reg_q,\
+                           neon_load2_2reg, neon_load2_2reg_q")
+            (const_string "neon_load_d")
+
+          (eq_attr "type" "neon_load3_one_lane,\
+                           neon_load3_all_lanes,\
+                           neon_load4_one_lane, neon_load4_all_lanes")
+            (const_string "neon_load_e")
+
+
+          (eq_attr "type" "neon_load3_one_lane_q,\
+                           neon_load3_all_lanes_q,\
+                           neon_load4_one_lane_q, neon_load4_all_lanes_q")
+            (const_string "neon_load_f")
+
+          (eq_attr "type" "neon_load3_3reg,neon_load3_3reg_q")
+            (const_string "neon_load_g")
+
+          (eq_attr "type" "neon_load2_4reg,neon_load2_4reg_q,\
+                           neon_load4_4reg,neon_load4_4reg_q")
+            (const_string "neon_load_h")
+
+          (eq_attr "type" "neon_store1_1reg, neon_store1_1reg_q,\
+                           neon_store1_2reg, neon_store1_2reg_q,\
+                           neon_store1_3reg, neon_store1_3reg_q,\
+                           neon_store1_4reg, neon_store1_4reg_q,\
+                           neon_store1_one_lane, neon_store1_one_lane_q,\
+                           neon_store2_2reg, neon_store2_2reg_q,\
+                           neon_store3_one_lane, neon_store3_one_lane_q,\
+                           neon_store4_one_lane, neon_store4_one_lane_q")
+            (const_string "neon_store_a")
+
+          (eq_attr "type" "neon_store2_4reg, neon_store2_4reg_q,\
+                           neon_store2_one_lane, neon_store2_one_lane_q,\
+                           neon_store3_3reg, neon_store3_3reg_q,\
+                           neon_store4_4reg, neon_store4_4reg_q")
+            (const_string "neon_store_b")
+]
+          (const_string "unknown")))
+
+(define_automaton "cortex_a17_neon")
+
+(define_cpu_unit "ca17_asimd0, ca17_asimd1" "cortex_a17_neon")
+(define_cpu_unit "ca17_fdiv0,ca17_simdfpadd0, ca17_simdfpmul0" "cortex_a17_neon")
+(define_cpu_unit "ca17_simdimac0, ca17_simdialu0, ca17_perm0" "cortex_a17_neon")
+
+(define_cpu_unit "ca17_simdialu1, ca17_perm1, ca17_simdshift1" "cortex_a17_neon")
+(define_cpu_unit "ca17_iacc1" "cortex_a17_neon")
+(define_cpu_unit "ca17_fpmul1, ca17_fpadd1" "cortex_a17_neon")
+
+
+;; Integer Arithmetic Instructions.
+
+(define_insn_reservation "cortex_a17_neon_abd" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_abd"))
+  "(ca17_asimd0+ca17_simdialu0) | (ca17_asimd1+ca17_simdialu1)")
+
+(define_insn_reservation "cortex_a17_neon_abd_q" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_abd_q"))
+  "ca17_asimd0+ca17_asimd1+ca17_simdialu0+ca17_simdialu1")
+
+(define_insn_reservation "cortex_a17_neon_aba" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_arith_acc"))
+  "ca17_asimd1+ca17_simdialu1, ca17_iacc1")
+
+(define_insn_reservation "cortex_a17_neon_aba_q" 8
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_arith_acc_q"))
+  "ca17_asimd0+ca17_asimd1+ca17_simdialu0+ca17_simdialu1, ca17_iacc1*2")
+
+(define_insn_reservation "cortex_a17_neon_arith_basic" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_arith_basic"))
+  "(ca17_asimd0+ca17_simdialu0) | (ca17_asimd1+ca17_simdialu1)")
+
+(define_insn_reservation "cortex_a17_neon_arith_complex" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_arith_complex"))
+  "(ca17_asimd0+ca17_simdialu0) | (ca17_asimd1+ca17_simdialu1)")
+
+;; Integer Multiply Instructions.
+
+(define_insn_reservation "cortex_a17_neon_multiply" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_multiply"))
+  "ca17_asimd0+ca17_simdimac0")
+
+(define_insn_reservation "cortex_a17_neon_multiply_q" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_multiply_q"))
+  "(ca17_asimd0+ca17_simdimac0)*2")
+
+(define_insn_reservation "cortex_a17_neon_mla" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_mla"))
+  "ca17_asimd0+ca17_simdimac0*2")
+
+(define_insn_reservation "cortex_a17_neon_mla_q" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_mla_q"))
+  "(ca17_asimd0+ca17_simdimac0)*2,ca17_simdimac0")
+
+(define_insn_reservation "cortex_a17_neon_sat_mla_long" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_sat_mla_long"))
+  "ca17_asimd0+ca17_simdimac0*2")
+
+;; Integer Shift Instructions.
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_acc" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_acc"))
+  "ca17_asimd1+ca17_simdshift1,ca17_iacc1")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_imm_basic" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_imm_basic"))
+  "ca17_asimd1+ca17_simdshift1")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_imm_complex" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_imm_complex"))
+  "ca17_asimd1+ca17_simdshift1")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_reg_basic" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_reg_basic"))
+  "ca17_asimd1+ca17_simdshift1")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_reg_basic_q" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_reg_basic_q"))
+  "(ca17_asimd1+ca17_simdshift1)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_reg_complex" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_reg_complex"))
+  "ca17_asimd1+ca17_simdshift1")
+
+(define_insn_reservation
+  "cortex_a17_neon_shift_reg_complex_q" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_shift_reg_complex_q"))
+  "(ca17_asimd1+ca17_simdshift1)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_negabs" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_negabs"))
+  "ca17_asimd0+ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_arith" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_arith"))
+  "ca17_asimd0+ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_arith_q" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_arith_q"))
+  "(ca17_asimd0+ca17_simdfpadd0)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_cvt_int" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_cvt_int"))
+  "ca17_asimd0+ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_cvt_int_q" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_cvt_int_q"))
+  "(ca17_asimd0+ca17_simdfpadd0)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_cvt16" 10
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_cvt16"))
+  "ca17_asimd0+ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_mul" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_mul"))
+  "ca17_asimd0+ca17_simdfpmul0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_mul_q" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_mul_q"))
+  "(ca17_asimd0+ca17_simdfpmul0)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_mla" 8
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_mla"))
+  "ca17_asimd0+ca17_simdfpmul0,ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_mla_q" 9
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_mla_q"))
+  "ca17_asimd0+ca17_simdfpmul0,ca17_asimd0+ca17_simdfpadd0+ca17_simdfpmul0,ca17_simdfpadd0")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_recps_rsqrte" 9
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_recpe_rsqrte"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_fp_recps_rsqrte_q" 9
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_fp_recpe_rsqrte_q"))
+  "(ca17_asimd0+ca17_perm0)*2|(ca17_asimd1+ca17_perm1)*2")
+
+;; Miscelaneous Instructions.
+
+(define_insn_reservation
+  "cortex_a17_neon_bitops" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_bitops"))
+  "(ca17_asimd0+ca17_perm0) | (ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_bitops_q" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_bitops_q"))
+  "(ca17_asimd0+ca17_perm0)*2 | (ca17_asimd1+ca17_perm1)*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_from_gp" 2
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_from_gp"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_from_gp_q" 3
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_from_gp_q"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_tbl3_tbl4" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_tbl3_tbl4"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_zip_q" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_zip_q"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation
+  "cortex_a17_neon_to_gp" 2
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_to_gp"))
+  "ca17_asimd0+ca17_perm0*3")
+
+(define_insn_reservation
+  "cortex_a17_vfp_flag" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_flag"))
+  "ca17_asimd0+ca17_perm0")
+
+;; Load Instructions.
+
+(define_insn_reservation
+  "cortex_a17_vfp_load" 5
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_loads, f_loadd"))
+  "ca17_ls0|ca17_ls1")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_a" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_a"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_b" 7
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_b"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_c" 8
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_c"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_d" 9
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_d"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_e" 9
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_e"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_f" 10
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_f"))
+  "ca17_ls0*2+ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_g" 10
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_g"))
+  "ca17_ls0*2+ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_load_h" 11
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_load_h"))
+  "ca17_ls0*2+ca17_ls1*2")
+
+;; Store Instructions.
+
+(define_insn_reservation
+  "cortex_a17_vfp_store" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_stores, f_stored"))
+  "ca17_ls0|ca17_ls1")
+
+
+(define_insn_reservation
+  "cortex_a17_neon_store_a" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_store_a"))
+  "ca17_ls0*2|ca17_ls1*2")
+
+(define_insn_reservation
+  "cortex_a17_neon_store_b" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "cortex_a17_neon_type" "neon_store_b"))
+  "ca17_ls0*2+ca17_ls1*2")
+
+;; VFP Operations.
+
+(define_insn_reservation "cortex_a17_vfp_const" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fconsts,fconstd"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_adds_subs" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fadds"))
+  "ca17_asimd1+ca17_fpadd1")
+
+
+(define_insn_reservation "cortex_a17_vfp_addd_subd" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "faddd"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_mul" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fmuls,fmuld"))
+  "ca17_asimd1+ca17_fpmul1")
+
+(define_insn_reservation "cortex_a17_vfp_mac" 11
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fmacs,ffmas,fmacd,ffmad"))
+  "ca17_asimd1+ca17_fpmul1,ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_cvt" 6
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_cvt,f_cvtf2i,f_cvti2f,f_rints,f_rintd"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_cmp" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fcmps,fcmpd"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_arithd" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "ffarithd"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_cpys" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fmov,fcsel"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_gp_to_vfp" 2
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_mcr, f_mcrr"))
+  "(ca17_asimd0+ca17_perm0)|(ca17_asimd1+ca17_perm1)")
+
+(define_insn_reservation "cortex_a17_mov_vfp_to_gp" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "f_mrc, f_mrrc"))
+  "ca17_asimd0+ca17_perm0*3")
+
+(define_insn_reservation "cortex_a17_vfp_ariths" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "ffariths"))
+  "ca17_asimd1+ca17_fpadd1")
+
+(define_insn_reservation "cortex_a17_vfp_divs" 18
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fdivs, fsqrts"))
+  "ca17_asimd0+ca17_fdiv0*10")
+
+(define_insn_reservation "cortex_a17_vfp_divd" 32
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "fdivd, fsqrtd"))
+  "ca17_asimd0+ca17_fdiv0*10")
+
diff --git a/gcc/config/arm/cortex-a17.md b/gcc/config/arm/cortex-a17.md
new file mode 100644
index 0000000..9ee8ce8
--- /dev/null
+++ b/gcc/config/arm/cortex-a17.md
@@ -0,0 +1,169 @@
+;; ARM Cortex-A17 pipeline description
+;; Copyright (C) 2014 Free Software Foundation, Inc.
+;;
+;; Contributed by ARM Ltd.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; .
+
+
+(define_automaton "cortex_a17")
+
+(define_cpu_unit "ca17_ls0, ca17_ls1" "cortex_a17")
+(define_cpu_unit "ca17_alu0, ca17_alu1" "cortex_a17")
+(define_cpu_unit "ca17_mac" "cortex_a17")
+(define_cpu_unit "ca17_idiv" "cortex_a17")
+(define_cpu_unit "ca17_bx" "cortex_a17")
+
+(define_reservation "ca17_alu" "(ca17_alu0|ca17_alu1)")
+
+
+
+;; Simple Execution Unit:
+;;
+;; Simple ALU
+(define_insn_reservation "cortex_a17_alu" 1
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "alu_imm,alus_imm,logic_imm,logics_imm,\
+                        alu_sreg,alus_sreg,logic_reg,logics_reg,\
+                        adc_imm,adcs_imm,adc_reg,adcs_reg,\
+                        adr, mov_imm,mov_reg,\
+                        mvn_imm,mvn_reg,extend,\
+                        mrs,multiple,no_insn"))
+  "ca17_alu")
+
+(define_insn_reservation "cortex_a17_alu_shiftimm" 2
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "bfm,clz,rev,rbit, alu_shift_imm, alus_shift_imm,
+                        logic_shift_imm,alu_dsp_reg, logics_shift_imm,shift_imm,\
+                        shift_reg, mov_shift,mvn_shift"))
+  "ca17_alu")
+
+
+;; ALU ops with register controlled shift.
+(define_insn_reservation "cortex_a17_alu_shift_reg" 2
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "alu_shift_reg,alus_shift_reg,\
+                        logic_shift_reg,logics_shift_reg"))
+  "ca17_alu0")
+
+
+;; Multiply Execution Unit:
+
+;; 32-bit multiplies
+(define_insn_reservation "cortex_a17_mult32" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "mul,muls,smmul,smmulr"))
+  "ca17_alu0+ca17_mac")
+
+(define_insn_reservation "cortex_a17_mac32" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "mla,mlas,smmla"))
+  "ca17_alu0+ca17_mac,ca17_mac")
+
+(define_insn_reservation "cortex_a17_mac32_other" 3
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "smlad,smladx,smlsd,smlsdx,smuad,smuadx,smusd,smusdx"))
+  "ca17_alu0+ca17_mac,ca17_mac")
+
+;; 64-bit multiplies
+(define_insn_reservation "cortex_a17_mac64" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "smlal,smlals,umaal,umlal,umlals"))
+  "ca17_alu0+ca17_mac,ca17_mac")
+
+(define_insn_reservation "cortex_a17_mac64_other" 3
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "smlald,smlalxy,smlsld"))
+  "ca17_alu0+ca17_mac,ca17_mac")
+
+(define_insn_reservation "cortex_a17_mult64" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "smull,smulls,umull,umulls"))
+  "ca17_alu0+ca17_mac,ca17_mac")
+
+
+(define_bypass 2 "cortex_a17_mult*, cortex_a17_mac*"
+               "cortex_a17_mult*, cortex_a17_mac*"
+               "arm_mac_accumulator_is_result")
+
+;; Integer divide
+(define_insn_reservation "cortex_a17_udiv" 19
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "udiv"))
+  "ca17_alu1+ca17_idiv*10")
+
+(define_insn_reservation "cortex_a17_sdiv" 20
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "sdiv"))
+  "ca17_alu1+ca17_idiv*11")
+
+
+
+;; Branch execution Unit
+;;
+;; Branches take one issue slot.
+;; No latency as there is no result
+(define_insn_reservation "cortex_a17_branch" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "branch"))
+  "ca17_bx")
+
+;; Load-store execution Unit
+;;
+;; Loads of up to two words.
+(define_insn_reservation "cortex_a17_load1" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "load_byte,load1,load2"))
+  "ca17_ls0|ca17_ls1")
+
+;; Loads of three words.
+(define_insn_reservation "cortex_a17_load3" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "load3"))
+  "ca17_ls0+ca17_ls1")
+
+;; Loads of four words.
+(define_insn_reservation "cortex_a17_load4" 4
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "load4"))
+  "ca17_ls0+ca17_ls1")
+
+;; Stores of up to two words.
+(define_insn_reservation "cortex_a17_store1" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "store1,store2"))
+  "ca17_ls0|ca17_ls1")
+
+;; Stores of three words
+(define_insn_reservation "cortex_a17_store3" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "store3"))
+  "ca17_ls0+ca17_ls1")
+
+;; Stores of four words.
+(define_insn_reservation "cortex_a17_store4" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "store4"))
+  "ca17_ls0+ca17_ls1")
+
+(define_insn_reservation "cortex_a17_call" 0
+  (and (eq_attr "tune" "cortexa17")
+       (eq_attr "type" "call"))
+  "ca17_bx")
+
+
+(include "../arm/cortex-a17-neon.md")
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index 6d9c417..bdaf48a 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -41,6 +41,7 @@ static struct vendor_cpu arm_cpu_table[] = {
     {"0xc08", "armv7-a", "cortex-a8"},
     {"0xc09", "armv7-a", "cortex-a9"},
     {"0xc0d", "armv7ve", "cortex-a12"},
+    {"0xc0e", "armv7ve", "cortex-a17"},
     {"0xc0f", "armv7ve", "cortex-a15"},
     {"0xc14", "armv7-r", "cortex-r4"},
     {"0xc15", "armv7-r", "cortex-r5"},
diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile
index ff9e2e1..475aed1 100644
--- a/gcc/config/arm/t-aprofile
+++ b/gcc/config/arm/t-aprofile
@@ -83,6 +83,7 @@ MULTILIB_MATCHES += march?armv7-a=mcpu?cortex-a9
 MULTILIB_MATCHES += march?armv7-a=mcpu?cortex-a5
 MULTILIB_MATCHES += march?armv7ve=mcpu?cortex-a15
 MULTILIB_MATCHES += march?armv7ve=mcpu?cortex-a12
+MULTILIB_MATCHES += march?armv7ve=mcpu?cortex-a17
 MULTILIB_MATCHES += march?armv7ve=mcpu?cortex-a15.cortex-a7
 MULTILIB_MATCHES += march?armv8-a=mcpu?cortex-a53
 MULTILIB_MATCHES += march?armv8-a=mcpu?cortex-a57
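
A note on the driver-arm.c hunk: the new table row associates the CPU part number 0xc0e with the armv7ve architecture and the cortex-a17 core name. The sketch below shows roughly how such a table could be searched. It is a standalone illustration under stated assumptions, not the driver's actual code: the field names part_no, arch_name and cpu_name, the NULL terminator row and the helper lookup_part are all hypothetical. Only the table rows themselves are copied from the hunk.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout of one table row: part number, architecture, core.  */
struct vendor_cpu
{
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

/* Rows copied from the hunk; the NULL row is a terminator added for
   this sketch.  */
static const struct vendor_cpu arm_cpu_table[] = {
  {"0xc08", "armv7-a", "cortex-a8"},
  {"0xc09", "armv7-a", "cortex-a9"},
  {"0xc0d", "armv7ve", "cortex-a12"},
  {"0xc0e", "armv7ve", "cortex-a17"},
  {"0xc0f", "armv7ve", "cortex-a15"},
  {NULL, NULL, NULL}
};

/* Hypothetical helper: return the row whose part number equals
   PART_NO, or NULL when the part number is not in the table.  */
const struct vendor_cpu *
lookup_part (const char *part_no)
{
  const struct vendor_cpu *row;

  for (row = arm_cpu_table; row->part_no != NULL; row++)
    if (strcmp (row->part_no, part_no) == 0)
      return row;

  return NULL;
}
```

Under these assumptions, lookup_part ("0xc0e") returns the row naming cortex-a17 with architecture armv7ve, and an unlisted part number returns NULL.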