Message ID | 9714dd4b-6beb-4007-9bf4-c9ee80ab5ad6@BAMAIL02.ba.imgtec.org |
---|---|
State | New |
Headers | show |
"Steve Ellcey " <sellcey@imgtec.com> writes: > While looking at some matrix code I noticed that the generic mips scheduler > was putting out a bunch of integer madd instructions in a row and that > I got better performance (on 74kc and proAptiv) when they were spread out. > > I was wondering what folks thought of this change to specify that the > integer madd instruction uses the alu and the imuldiv functional units? > > Tested on 74kc and proAptive with no regressions and better performance. Hmm, but generic.md is very much legacy and shouldn't be used for vaguely modern cores. Even something like -mips32 is supposed to avoid the generic scheduler; it should use the 4k scheduler instead. What options were you using to trigger it? generic.md is actually inherited by some of the very old, pre-DFA schedulers, including those that were probably used with the old PMC cores. I'm not even sure how much generic.md was ever supposed to be a good "all-round" scheduling description, akin to the modern -mtune=generic on x86 targets. I'm a bit reluctant to change it at this stage. It might be worth having a new "generic" scheduler that's supposed to be a good compromise for modern cores though. Or, more simply, we could just change: MIPS_CPU ("mips32", PROCESSOR_4KC, 32, PTF_AVOID_BRANCHLIKELY) MIPS_CPU ("mips32r2", PROCESSOR_M4K, 33, PTF_AVOID_BRANCHLIKELY) MIPS_CPU ("mips64", PROCESSOR_5KC, 64, PTF_AVOID_BRANCHLIKELY) /* ??? For now just tune the generic MIPS64r2 for 5KC as well. */ MIPS_CPU ("mips64r2", PROCESSOR_5KC, 65, PTF_AVOID_BRANCHLIKELY) to tune for other processors instead, if you don't think 4kc, etc. are representative enough. Richard
On Tue, 2013-05-21 at 19:55 +0100, Richard Sandiford wrote: > Hmm, but generic.md is very much legacy and shouldn't be used for > vaguely modern cores. Even something like -mips32 is supposed to avoid > the generic scheduler; it should use the 4k scheduler instead. > What options were you using to trigger it? I am just doing a default compilation of my mips-mti-linux-gnu target. This defaults to mips32r2. I just tried adding -mtune=mips32r2 and that made no difference, -mtune=4k also makes no difference, but -mtune=4kc does affect the output. > It might be worth having a new "generic" scheduler that's supposed to be > a good compromise for modern cores though. Or, more simply, we could just > change: > > MIPS_CPU ("mips32", PROCESSOR_4KC, 32, PTF_AVOID_BRANCHLIKELY) > MIPS_CPU ("mips32r2", PROCESSOR_M4K, 33, PTF_AVOID_BRANCHLIKELY) > MIPS_CPU ("mips64", PROCESSOR_5KC, 64, PTF_AVOID_BRANCHLIKELY) > /* ??? For now just tune the generic MIPS64r2 for 5KC as well. */ > MIPS_CPU ("mips64r2", PROCESSOR_5KC, 65, PTF_AVOID_BRANCHLIKELY) > > to tune for other processors instead, if you don't think 4kc, etc. are > representative enough. Hm, I think the problem may be that mips32r2 defaults to PROCESSOR_M4K and mips32 defaults to PROCESSOR_4KC. I don't see any special scheduler for m4k. Is there supposed to be a scheduler for m4k? Steve Ellcey
Steve Ellcey <sellcey@imgtec.com> writes: >> It might be worth having a new "generic" scheduler that's supposed to be >> a good compromise for modern cores though. Or, more simply, we could just >> change: >> >> MIPS_CPU ("mips32", PROCESSOR_4KC, 32, PTF_AVOID_BRANCHLIKELY) >> MIPS_CPU ("mips32r2", PROCESSOR_M4K, 33, PTF_AVOID_BRANCHLIKELY) >> MIPS_CPU ("mips64", PROCESSOR_5KC, 64, PTF_AVOID_BRANCHLIKELY) >> /* ??? For now just tune the generic MIPS64r2 for 5KC as well. */ >> MIPS_CPU ("mips64r2", PROCESSOR_5KC, 65, PTF_AVOID_BRANCHLIKELY) >> >> to tune for other processors instead, if you don't think 4kc, etc. are >> representative enough. > > Hm, I think the problem may be that mips32r2 defaults to PROCESSOR_M4K > and mips32 defaults to PROCESSOR_4KC. I don't see any special scheduler > for m4k. Is there supposed to be a scheduler for m4k? Oops -- only if someone submitted one :-) So we should definitely change the mips32r2 entry. I'd suggest one of PROCESSOR_24KF* or PROCESSOR_74KF*, so that we get the FPU scheduling, but I don't know which would be more representative of the general case. TUNE_74* has quite a lot of special code associated with it, whereas TUNE_24* sets TUNE_MACC_CHAINS, which might overemphasise the use of MADD for 74k targets when -mimadd is used. But any choice is going to be a compromise. A patch to do either is preapproved. Thanks, Richard
diff --git a/gcc/config/mips/generic.md b/gcc/config/mips/generic.md index 6051630..f187e0b 100644 --- a/gcc/config/mips/generic.md +++ b/gcc/config/mips/generic.md @@ -47,9 +47,13 @@ "imuldiv*3") (define_insn_reservation "generic_imul" 17 - (eq_attr "type" "imul,imul3,imadd") + (eq_attr "type" "imul,imul3") "imuldiv*17") +(define_insn_reservation "generic_imadd" 17 + (eq_attr "type" "imadd") + "(alu+imuldiv)*17") + (define_insn_reservation "generic_idiv" 38 (eq_attr "type" "idiv") "imuldiv*38")