Message ID | EB4625145972F94C9680D8CADD6516152A36BE71@sausexdag04.amd.com |
---|---|
State | New |
Headers | show |
On Thu, Sep 27, 2012 at 10:30 AM, Gopalasubramanian, Ganesh <Ganesh.Gopalasubramanian@amd.com> wrote: > This is a fix for PR 51109. > > There are three changes > > 1. Microcoded instructions are considered as single issue instructions and are therefore issued to a separate execution unit. > 2. The multiplier unit is attached to execution unit 1 (ieu1). Since ieu is handled as a separate automaton in the patch, separate "mult" automaton is not required. > 3. The integer execution units (2AGUs and 2EXs) are now decoupled. Now, they are described as separate automatons. > > Is it OK for upstream? > > Regards > Ganesh > > 2012-09-27 Ganesh Gopalasubramanian <Ganesh.Gopalasubramanian@amd.com> > > PR 51109 > * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been > split to reduce state transitions. OK for mainline, if tested according to [1]. [1] http://gcc.gnu.org/contribute.html#testing Thanks, Uros.
Testing was done before posting the patch. It was successful. Regards Ganesh -----Original Message----- From: Uros Bizjak [mailto:ubizjak@gmail.com] Sent: Thursday, September 27, 2012 5:57 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced On Thu, Sep 27, 2012 at 10:30 AM, Gopalasubramanian, Ganesh <Ganesh.Gopalasubramanian@amd.com> wrote: > This is a fix for PR 51109. > > There are three changes > > 1. Microcoded instructions are considered as single issue instructions and are therefore issued to a separate execution unit. > 2. The multiplier unit is attached to execution unit 1 (ieu1). Since ieu is handled as a separate automaton in the patch, separate "mult" automaton is not required. > 3. The integer execution units (2AGUs and 2EXs) are now decoupled. Now, they are described as separate automatons. > > Is it OK for upstream? > > Regards > Ganesh > > 2012-09-27 Ganesh Gopalasubramanian > <Ganesh.Gopalasubramanian@amd.com> > > PR 51109 > * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been > split to reduce state transitions. OK for mainline, if tested according to [1]. [1] http://gcc.gnu.org/contribute.html#testing Thanks, Uros.
Hi,
On 10/03/2012 11:57 AM, Gopalasubramanian, Ganesh wrote:
> Testing was done before posting the patch. It was successful.
This change is now in:
http://gcc.gnu.org/ml/gcc-cvs/2012-10/msg00418.html
and it looks like is breaking the build for me:
build/genattrtab
/scratch/Gcc/svn-dirs/trunk/gcc/config/i386/i386.md insn-conditions.md \
-Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
build/genautomata /scratch/Gcc/svn-dirs/trunk/gcc/config/i386/i386.md \
insn-conditions.md > tmp-automata.c
genautomata: automaton `bdver1_mult' is not declared
genautomata: unit `bdver1-mult' is not used
Paolo.
On Wed, Oct 10, 2012 at 12:29:04PM +0200, Paolo Carlini wrote: > Hi, > > On 10/03/2012 11:57 AM, Gopalasubramanian, Ganesh wrote: > >Testing was done before posting the patch. It was successful. > This change is now in: > > http://gcc.gnu.org/ml/gcc-cvs/2012-10/msg00418.html > > and it looks like is breaking the build for me: > > build/genattrtab > /scratch/Gcc/svn-dirs/trunk/gcc/config/i386/i386.md > insn-conditions.md \ > -Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c > build/genautomata /scratch/Gcc/svn-dirs/trunk/gcc/config/i386/i386.md \ > insn-conditions.md > tmp-automata.c > genautomata: automaton `bdver1_mult' is not declared > genautomata: unit `bdver1-mult' is not used Yeah, clearly a different version of the patch has been posted vs. what has been checked in. The difference is removal of the (define_cpu_unit "bdver1-mult" "bdver1_mult") line (present in the posted patch, not in the checked in patch). Also, the ChangeLog entry is wrong: PR 51109 * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been split to reduce state transitions. - should start with: PR target/51109 * config/i386/bdver1.md and really document the changes it has done, or just not contain (bdver1_int) and document the file changes. And Copyright year hasn't been updated (but this wasn't first commit in 2012, others made the same mistake, no changes in 2011 btw). Jakub
Hi, On 10/10/2012 12:53 PM, Jakub Jelinek wrote: > Yeah, clearly a different version of the patch has been posted > vs. what has been checked in. The difference is removal of the > (define_cpu_unit "bdver1-mult" "bdver1_mult") > line (present in the posted patch, not in the checked in patch). > > Also, the ChangeLog entry is wrong: > PR 51109 > * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been > split to reduce state transitions. > - should start with: > PR target/51109 > * config/i386/bdver1.md > and really document the changes it has done, or just not contain > (bdver1_int) and document the file changes. > > And Copyright year hasn't been updated (but this wasn't first > commit in 2012, others made the same mistake, no changes in 2011 > btw). Shall we (I) just revert the change for now? Thanks, Paolo.
Index: gcc/config/i386/bdver1.md =================================================================== --- gcc/config/i386/bdver1.md (revision 191658) +++ gcc/config/i386/bdver1.md (working copy) @@ -36,7 +36,7 @@ (define_attr "bdver1_decode" "direct,vector,double" (const_string "direct")) -(define_automaton "bdver1,bdver1_int,bdver1_load,bdver1_mult,bdver1_fp") +(define_automaton "bdver1,bdver1_ieu,bdver1_load,bdver1_fp,bdver1_agu") (define_cpu_unit "bdver1-decode0" "bdver1") (define_cpu_unit "bdver1-decode1" "bdver1") @@ -71,16 +71,14 @@ | (nothing,(bdver1-decode1 + bdver1-decode2)))") -(define_cpu_unit "bdver1-ieu0" "bdver1_int") -(define_cpu_unit "bdver1-ieu1" "bdver1_int") +(define_cpu_unit "bdver1-ieu0" "bdver1_ieu") +(define_cpu_unit "bdver1-ieu1" "bdver1_ieu") (define_reservation "bdver1-ieu" "(bdver1-ieu0 | bdver1-ieu1)") -(define_cpu_unit "bdver1-agu0" "bdver1_int") -(define_cpu_unit "bdver1-agu1" "bdver1_int") +(define_cpu_unit "bdver1-agu0" "bdver1_agu") +(define_cpu_unit "bdver1-agu1" "bdver1_agu") (define_reservation "bdver1-agu" "(bdver1-agu0 | bdver1-agu1)") -(define_cpu_unit "bdver1-mult" "bdver1_mult") - (define_cpu_unit "bdver1-load0" "bdver1_load") (define_cpu_unit "bdver1-load1" "bdver1_load") (define_reservation "bdver1-load" "bdver1-agu, @@ -93,6 +91,12 @@ ;; 128bit SSE instructions issue two stores at once. (define_reservation "bdver1-store2" "(bdver1-load0 + bdver1-load1)") +;; vectorpath (microcoded) instructions are single issue instructions. +;; So, they occupy all the integer units. +(define_reservation "bdver1-ivector" "bdver1-ieu0+bdver1-ieu1+ + bdver1-agu0+bdver1-agu1+ + bdver1-load0+bdver1-load1") + ;; The FP operations start to execute at stage 12 in the pipeline, while ;; integer operations start to execute at stage 9 for athlon and 11 for K8 ;; Compensate the difference for athlon because it results in significantly @@ -125,7 +129,7 @@ (define_insn_reservation "bdver1_call" 0 (and (eq_attr "cpu" "bdver1,bdver2") (eq_attr "type" "call,callv")) - "bdver1-double,bdver1-agu,bdver1-ieu") + "bdver1-double,bdver1-agu") ;; PUSH mem is double path. (define_insn_reservation "bdver1_push" 1 (and (eq_attr "cpu" "bdver1,bdver2") @@ -135,17 +139,17 @@ (define_insn_reservation "bdver1_pop" 1 (and (eq_attr "cpu" "bdver1,bdver2") (eq_attr "type" "pop")) - "bdver1-direct,(bdver1-ieu+bdver1-load)") + "bdver1-direct,bdver1-ivector") ;; LEAVE no latency info so far, assume same with amdfam10. (define_insn_reservation "bdver1_leave" 3 (and (eq_attr "cpu" "bdver1,bdver2") (eq_attr "type" "leave")) - "bdver1-vector,(bdver1-ieu+bdver1-load)") + "bdver1-vector,bdver1-ivector") ;; LEA executes in AGU unit with 1 cycle latency on BDVER1. (define_insn_reservation "bdver1_lea" 1 (and (eq_attr "cpu" "bdver1,bdver2") (eq_attr "type" "lea")) - "bdver1-direct,bdver1-agu,nothing") + "bdver1-direct,bdver1-agu") ;; MUL executes in special multiplier unit attached to IEU1. (define_insn_reservation "bdver1_imul_DI" 6 @@ -153,23 +157,23 @@ (and (eq_attr "type" "imul") (and (eq_attr "mode" "DI") (eq_attr "memory" "none,unknown")))) - "bdver1-direct1,bdver1-ieu1,bdver1-mult,nothing,bdver1-ieu1") + "bdver1-direct1,bdver1-ieu1") (define_insn_reservation "bdver1_imul" 4 (and (eq_attr "cpu" "bdver1,bdver2") (and (eq_attr "type" "imul") (eq_attr "memory" "none,unknown"))) - "bdver1-direct1,bdver1-ieu1,bdver1-mult,bdver1-ieu1") + "bdver1-direct1,bdver1-ieu1") (define_insn_reservation "bdver1_imul_mem_DI" 10 (and (eq_attr "cpu" "bdver1,bdver2") (and (eq_attr "type" "imul") (and (eq_attr "mode" "DI") (eq_attr "memory" "load,both")))) - "bdver1-direct1,bdver1-load,bdver1-ieu,bdver1-mult,nothing,bdver1-ieu") + "bdver1-direct1,bdver1-load,bdver1-ieu1") (define_insn_reservation "bdver1_imul_mem" 8 (and (eq_attr "cpu" "bdver1,bdver2") (and (eq_attr "type" "imul") (eq_attr "memory" "load,both"))) - "bdver1-direct1,bdver1-load,bdver1-ieu,bdver1-mult,bdver1-ieu") + "bdver1-direct1,bdver1-load,bdver1-ieu1") ;; IDIV cannot execute in parallel with other instructions. Dealing with it ;; as with short latency vector instruction is good approximation avoiding