Message ID | 4FCE6A2C.6070803@freescale.com |
---|---|
State | New |

On Tue, Jun 5, 2012 at 4:21 PM, Edmar <edmar@freescale.com> wrote:
> David, Michael,
>
> Here is the new type "popcnt" patch that I had separated from previous
> E5500/E6500 submission, also added the changes suggested by Michael
> Meissner (detailed below).
> I am missing some details for power6. (Could not find any documentation)
>
> Bootstrapped with no regressions, all languages enabled, configured
> for target powerpc64 and used "--with-cpu=<>" for each of power6, power7,
> and 970.
> All work performed on svn revision number 188200.

No.

As Mike and I both explained, this is a standard part of an ISA level.
We are not going to start supporting finer granularity. If the
Freescale processors do not support the popcnt instruction then they
do not support that level of the ISA.

- David

On 06/06/2012 08:57 AM, David Edelsohn wrote:
> On Tue, Jun 5, 2012 at 4:21 PM, Edmar<edmar@freescale.com> wrote:
>> David, Michael,
>>
>> Here is the new type "popcnt" patch that I had separated from previous
>> E5500/E6500 submission, also added the changes suggested by Michael
>> Meissner (detailed below).
>> I am missing some details for power6. (Could not find any documentation)
>>
>> Bootstrapped with no regressions, all languages enabled, configured
>> for target powerpc64 and used "--with-cpu=<>" for each of power6, power7,
>> and 970.
>> All work performed on svn revision number 188200.
> No.
>
> As Mike and I both explained, this is a standard part of an ISA level.
> We are not going to start supporting finer granularity. If the
> Freescale processors do not support the popcnt instruction then they
> do not support that level of the ISA.
>
> - David

I agree, and I understood this the first time.
This patch is exclusively for IBM parts.

I am posting it because Michael expressed interest in having popcnt
scheduling control for those IBM parts, and this came up while he was
reviewing my E5500/E6500 patch.

I understand if this patch is still not right. I just don't want to leave
a wrong impression.

Thanks,
Edmar

On Tue, Jun 05, 2012 at 04:21:00PM -0400, Edmar wrote:
> David, Michael,
>
> Here is the new type "popcnt" patch that I had separated from previous
> E5500/E6500 submission, also added the changes suggested by Michael
> Meissner (detailed below).
> I am missing some details for power6. (Could not find any documentation)
>
> Bootstrapped with no regressions, all languages enabled, configured
> for target powerpc64 and used "--with-cpu=<>" for each of power6,
> power7, and 970.
> All work performed on svn revision number 188200.
>
> NOTES:
> - 403 and 440 manuals do not list popcnt* instructions.
>   Skipped.
> - 750, 74xx Freescale parts do not have popcnt* instructions.
>   Skipped.
> - 476 manual lists popcnt as requiring i-pipe.
>   Added to corresponding insn reservation.
> - power4 (IBM 970) pre-dates ISA-2.02. It does not have popcnt*
>   instructions.
>   Skipped.
> - power5, power7 groups simple integer and complex integer together.
>   Appended popcnt to insn reservation.
> - power6.md has different style. Created a separate reservation.
>   I used instruction latency of 1. Please confirm.
>   I did not add a store bypass either. Let me know if I should.

I do think it would be useful to have a popcnt insn type.

power6 is internally an in-order machine, while power4, power5 and power7
are out-of-order machines. Popcntb is a 1 cycle instruction, but it can
cause stalls in some cases.

On Wed, Jun 6, 2012 at 12:19 PM, Edmar <edmar@freescale.com> wrote:
> On 06/06/2012 08:57 AM, David Edelsohn wrote:
>>
>> On Tue, Jun 5, 2012 at 4:21 PM, Edmar<edmar@freescale.com> wrote:
>>>
>>> David, Michael,
>>>
>>> Here is the new type "popcnt" patch that I had separated from previous
>>> E5500/E6500 submission, also added the changes suggested by Michael
>>> Meissner (detailed below).
>>> I am missing some details for power6. (Could not find any documentation)
>>>
>>> Bootstrapped with no regressions, all languages enabled, configured
>>> for target powerpc64 and used "--with-cpu=<>" for each of power6, power7,
>>> and 970.
>>> All work performed on svn revision number 188200.

2012-06-05  Edmar Wienskoski  <edmar@freescale.com>

	* config/rs6000/rs6000.md (define_attr "type"): New type popcnt.
	(popcntb<mode>2): Add attribute type popcnt.
	(popcntd<mode>2): Ditto.
	* config/rs6000/power4.md (define_insn_reservation): Add type popcnt.
	* config/rs6000/power5.md (define_insn_reservation): Ditto.
	* config/rs6000/power7.md (define_insn_reservation): Ditto.
	* config/rs6000/476.md (define_insn_reservation): Ditto.
	* config/rs6000/power6.md (define_insn_reservation): New reservation
	for popcnt instructions.

Sorry, I thought that the patch added an additional target flag.

The patch to add a "popcnt" instruction attribute is okay, and a 1 cycle
latency for POWER processors is fine.

Thanks, David

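For readers less familiar with the two instructions that receive the new "popcnt" type, here is a minimal software sketch in C of what they compute. It is only an illustration, not code from GCC or from the patch: the function names popcntb_model and popcount64_via_popcntb are hypothetical. The sketch assumes the Power ISA behaviour in which popcntb writes the bit count of each source byte into the corresponding result byte, while popcntd produces the count for the whole doubleword.

```c
#include <stdint.h>

/* Hypothetical software model of popcntb: byte i of the result holds the
   number of set bits in byte i of x. */
uint64_t popcntb_model(uint64_t x)
{
    uint64_t result = 0;
    for (int byte = 0; byte < 8; byte++) {
        uint64_t b = (x >> (8 * byte)) & 0xff;
        uint64_t count = 0;
        while (b != 0) {            /* count the set bits of this byte */
            count += b & 1;
            b >>= 1;
        }
        result |= count << (8 * byte);
    }
    return result;
}

/* Hypothetical helper: derive a full 64-bit population count (the value a
   doubleword popcount would return) from the per-byte counts. Multiplying
   by 0x0101010101010101 sums all eight byte counts into the top byte; the
   sum is at most 64, so it cannot overflow that byte. */
unsigned popcount64_via_popcntb(uint64_t x)
{
    return (unsigned)((popcntb_model(x) * UINT64_C(0x0101010101010101)) >> 56);
}
```

The second helper illustrates why a single instruction type is useful for scheduling: on a target with only popcntb, a full population count is a short dependent sequence (per-byte count, multiply, shift), so the latency assigned to the first step affects how the rest is scheduled.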
Index: gcc-20120604/gcc/config/rs6000/476.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/476.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/476.md	(working copy)
@@ -71,7 +71,7 @@
    ppc476_i_pipe|ppc476_lj_pipe")
 
 (define_insn_reservation "ppc476-complex-integer" 1
-  (and (eq_attr "type" "cmp,cr_logical,delayed_cr,cntlz,isel,isync,sync,trap")
+  (and (eq_attr "type" "cmp,cr_logical,delayed_cr,cntlz,isel,isync,sync,trap,popcnt")
        (eq_attr "cpu" "ppc476"))
   "ppc476_issue,\
    ppc476_i_pipe")
Index: gcc-20120604/gcc/config/rs6000/power7.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power7.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power7.md	(working copy)
@@ -150,7 +150,7 @@
 ; FX Unit
 (define_insn_reservation "power7-integer" 1
   (and (eq_attr "type" "integer,insert_word,insert_dword,shift,trap,\
-                        var_shift_rotate,exts,isel")
+                        var_shift_rotate,exts,isel,popcnt")
        (eq_attr "cpu" "power7"))
   "DU_power7,FXU_power7")
 
Index: gcc-20120604/gcc/config/rs6000/power6.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power6.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power6.md	(working copy)
@@ -216,6 +216,11 @@
        (eq_attr "cpu" "power6"))
   "FXU_power6")
 
+(define_insn_reservation "power6-popcnt" 1
+  (and (eq_attr "type" "popcnt")
+       (eq_attr "cpu" "power6"))
+  "FXU_power6")
+
 (define_insn_reservation "power6-insert" 1
   (and (eq_attr "type" "insert_word")
        (eq_attr "cpu" "power6"))
Index: gcc-20120604/gcc/config/rs6000/power5.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power5.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power5.md	(working copy)
@@ -142,7 +142,7 @@
 ; Integer latency is 2 cycles
 (define_insn_reservation "power5-integer" 2
   (and (eq_attr "type" "integer,insert_dword,shift,trap,\
-                        var_shift_rotate,cntlz,exts,isel")
+                        var_shift_rotate,cntlz,exts,isel,popcnt")
        (eq_attr "cpu" "power5"))
   "iq_power5")
 
Index: gcc-20120604/gcc/config/rs6000/rs6000.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/rs6000.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/rs6000.md	(working copy)
@@ -145,7 +145,7 @@
 
 ;; Define an insn type attribute.  This is used in function unit delay
 ;; computations.
-(define_attr "type" "integer,two,three,load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,store,store_ux,store_u,fpload,fpload_ux,fpload_u,fpstore,fpstore_ux,fpstore_u,vecload,vecstore,imul,imul2,imul3,lmul,idiv,ldiv,insert_word,branch,cmp,fast_compare,compare,var_delayed_compare,delayed_compare,imul_compare,lmul_compare,fpcompare,cr_logical,delayed_cr,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,jmpreg,brinc,vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,vecfloat,vecfdiv,vecdouble,isync,sync,load_l,store_c,shift,trap,insert_dword,var_shift_rotate,cntlz,exts,mffgpr,mftgpr,isel"
+(define_attr "type" "integer,two,three,load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,store,store_ux,store_u,fpload,fpload_ux,fpload_u,fpstore,fpstore_ux,fpstore_u,vecload,vecstore,imul,imul2,imul3,lmul,idiv,ldiv,insert_word,branch,cmp,fast_compare,compare,var_delayed_compare,delayed_compare,imul_compare,lmul_compare,fpcompare,cr_logical,delayed_cr,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,jmpreg,brinc,vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,vecfloat,vecfdiv,vecdouble,isync,sync,load_l,store_c,shift,trap,insert_dword,var_shift_rotate,cntlz,exts,mffgpr,mftgpr,isel,popcnt"
   (const_string "integer"))
 
 ;; Define floating point instruction sub-types for use with Xfpu.md
@@ -2330,13 +2330,17 @@
         (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")]
                      UNSPEC_POPCNTB))]
   "TARGET_POPCNTB"
-  "popcntb %0,%1")
+  "popcntb %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "popcnt")])
 
 (define_insn "popcntd<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (popcount:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
   "TARGET_POPCNTD"
-  "popcnt<wd> %0,%1")
+  "popcnt<wd> %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "popcnt")])
 
 (define_expand "popcount<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "")