Message ID | 5769637F.8090603@foss.arm.com |
---|---|
State | New |
On Tue, Jun 21, 2016 at 04:55:43PM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> This is a rebase of https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00403.html
> on top of Evandro's changes.
> Also, to elaborate on the original posting, the initial tuning structure is
> based on the Cortex-A57 one but with the issue rate set to 2, FMA steering
> turned off and ADRP+LDR fusion enabled.

I see you've also chosen to use the generic_branch_cost costs for
branches. As you didn't mention it explicitly here, was that intentional?

> Is this ok for trunk?

This looks OK to me. Watch out for the conflict with the Broadcom Vulcan
patch that was committed to trunk earlier today. The merge should be easy.

Thanks for the patch!

James

> 2016-06-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64.c (cortexa73_tunings): New struct.
>     * config/aarch64/aarch64-cores.def (cortex-a73): New entry.
>     (cortex-a73.cortex-a35): Likewise.
>     (cortex-a73.cortex-a53): Likewise.
>     * config/aarch64/aarch64-tune.md: Regenerate.
>     * doc/invoke.texi (AArch64 Options): Document cortex-a73,
>     cortex-a73.cortex-a35 and cortex-a73.cortex-a53 arguments to
>     -mcpu and -mtune.

Hi James,

On 21/06/16 17:38, James Greenhalgh wrote:
> On Tue, Jun 21, 2016 at 04:55:43PM +0100, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This is a rebase of https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00403.html
>> on top of Evandro's changes.
>> Also, to elaborate on the original posting, the initial tuning structure is
>> based on the Cortex-A57 one but with the issue rate set to 2, FMA steering
>> turned off and ADRP+LDR fusion enabled.
> I see you've also chosen to use the generic_branch_cost costs for
> branches. As you didn't mention it explicitly here, was that intentional?
>

Ah, that was copied from the Cortex-A72 tuning. I didn't spend any time
experimenting with it. generic_branch_cost should be good enough for the
initial enablement. I can change it to cortexa57_branch_cost if you'd like.
Or we can do it separately later (I suspect Cortex-A72 should use those costs
too.)

>> Is this ok for trunk?
> This looks OK to me. Watch out for the conflict with the Broadcom Vulcan
> patch that was committed to trunk earlier today. The merge should be easy.
>
> Thanks for the patch!

Thanks, I'll rebase and commit it today.

Kyrill

> James
>
>> 2016-06-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * config/aarch64/aarch64.c (cortexa73_tunings): New struct.
>>     * config/aarch64/aarch64-cores.def (cortex-a73): New entry.
>>     (cortex-a73.cortex-a35): Likewise.
>>     (cortex-a73.cortex-a53): Likewise.
>>     * config/aarch64/aarch64-tune.md: Regenerate.
>>     * doc/invoke.texi (AArch64 Options): Document cortex-a73,
>>     cortex-a73.cortex-a35 and cortex-a73.cortex-a53 arguments to
>>     -mcpu and -mtune.

On Wed, Jun 22, 2016 at 09:12:25AM +0100, Kyrill Tkachov wrote:
> Hi James,
>
> On 21/06/16 17:38, James Greenhalgh wrote:
> >On Tue, Jun 21, 2016 at 04:55:43PM +0100, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>This is a rebase of https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00403.html
> >>on top of Evandro's changes.
> >>Also, to elaborate on the original posting, the initial tuning structure is
> >>based on the Cortex-A57 one but with the issue rate set to 2, FMA steering
> >>turned off and ADRP+LDR fusion enabled.
> >I see you've also chosen to use the generic_branch_cost costs for
> >branches. As you didn't mention it explicitly here, was that intentional?
> >
>
> Ah, that was copied from the Cortex-A72 tuning. I didn't spend any time
> experimenting with it. generic_branch_cost should be good enough for the
> initial enablement. I can change it to cortexa57_branch_cost if you'd like.
> Or we can do it separately later (I suspect Cortex-A72 should use those costs
> too.)

Yes, I'm more than happy for it to be a follow-up. We'll probably need to
revisit the settings for a few of the cores once the if-convert cost model
changes I've been working on go in anyway.

Thanks again,
James

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 251a3ebb9be82def8f257cbdcab440d7a51d478b..3bbf42504c528fc364af19f422ff79dc0f8b7cd8 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -44,6 +44,7 @@ AARCH64_CORE("cortex-a35", cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AA
 AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03")
 AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07")
 AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08")
+AARCH64_CORE("cortex-a73", cortexa73, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09")
 AARCH64_CORE("exynos-m1", exynosm1, exynosm1, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1, "0x53", "0x001")
 AARCH64_CORE("qdf24xx", qdf24xx, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x51", "0x800")
 AARCH64_CORE("thunderx", thunderx, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1")
@@ -53,4 +54,5 @@ AARCH64_CORE("xgene1", xgene1, xgene1, 8A, AARCH64_FL_FOR_ARCH8, xge
 AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03")
 AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08.0xd03")
-
+AARCH64_CORE("cortex-a73.cortex-a35", cortexa73cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09.0xd04")
+AARCH64_CORE("cortex-a73.cortex-a53", cortexa73cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09.0xd03")
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index cbc6f4879edb2f3842a50dfafe206313d49e9cf8..392dfbd0d922007b2d245d168ab5cf95db2670b5 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa35,cortexa53,cortexa57,cortexa72,exynosm1,qdf24xx,thunderx,xgene1,cortexa57cortexa53,cortexa72cortexa53"
+	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,qdf24xx,thunderx,xgene1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 78140653a8b09e5789afe670dc2c2f22a3a94a08..43eaa272dc54f231d3f31eea0fdc0b288b0d3f61 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -548,6 +548,32 @@ static const struct tune_params cortexa72_tunings =
   (AARCH64_EXTRA_TUNE_NONE) /* tune_flags. */
 };
+static const struct tune_params cortexa73_tunings =
+{
+  &cortexa57_extra_costs,
+  &cortexa57_addrcost_table,
+  &cortexa57_regmove_cost,
+  &cortexa57_vector_cost,
+  &generic_branch_cost,
+  &generic_approx_modes,
+  4, /* memmov_cost. */
+  2, /* issue_rate. */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
+   | AARCH64_FUSE_MOVK_MOVK | AARCH64_FUSE_ADRP_LDR), /* fusible_ops */
+  16, /* function_align. */
+  8, /* jump_align. */
+  4, /* loop_align. */
+  2, /* int_reassoc_width. */
+  4, /* fp_reassoc_width. */
+  1, /* vec_reassoc_width. */
+  2, /* min_div_recip_mul_sf. */
+  2, /* min_div_recip_mul_df. */
+  0, /* max_case_values. */
+  0, /* cache_line_size. */
+  tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
+  (AARCH64_EXTRA_TUNE_NONE) /* tune_flags. */
+};
+
 static const struct tune_params exynosm1_tunings =
 {
   &exynosm1_extra_costs,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c05ce8f8f12f9419d68c9ab6ceb8d89310b6c077..92c34764fce31b6a6e59f740c1e2131692ac527c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13085,12 +13085,14 @@ processors implementing the target architecture.
 Specify the name of the target processor for which GCC should tune the
 performance of the code. Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57},
-@samp{cortex-a72}, @samp{exynos-m1}, @samp{qdf24xx}, @samp{thunderx},
-@samp{xgene1}, @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
-@samp{native}.
-
-The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53}
-specify that GCC should tune for a big.LITTLE system.
+@samp{cortex-a72}, @samp{cortex-a73}, @samp{exynos-m1}, @samp{qdf24xx},
+@samp{thunderx}, @samp{xgene1}, @samp{cortex-a57.cortex-a53},
+@samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
+@samp{cortex-a73.cortex-a53}, @samp{native}.
+
+The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
+@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53} specify that
+GCC should tune for a big.LITTLE system.
 Additionally on native AArch64 GNU/Linux systems the value
 @samp{native} tunes performance to the host system. This option has no effect