[RS6000] cost SLOW_UNALIGNED_ACCESS

Message ID 20160802143506.GG20904@bubble.grove.modra.org
State New

Commit Message

Alan Modra Aug. 2, 2016, 2:35 p.m. UTC
As noted in the last patch, rs6000_rtx_costs ought to cost slow
unaligned mems.  This stops combine merging loads/stores with a
mode-changing SET subreg, if the load/store in the subreg mode would
be slow.  Costing slow mems at 100 insns is just an order of magnitude
estimate.  (The alignment interrupt does cost quite a lot.
Experiments on power8 with a misaligned lwarx showed taking the
alignment interrupt cost roughly 300 insns.)

Bootstrapped and regression tested on powerpc64le-linux and
powerpc64-linux.

	* config/rs6000/rs6000.c (rs6000_rtx_costs): Make unaligned mem
	cost more.
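
For readers unfamiliar with the cost units, the arithmetic of the new MEM
case can be sketched in standalone C.  This is an illustration under stated
assumptions, not the rs6000 code: COSTS_N_INSNS is reproduced from GCC's
rtl.h, while slow_unaligned_access () and mem_cost () are hypothetical
stand-ins.  In particular, the real SLOW_UNALIGNED_ACCESS macro depends on
the mode and on target flags; the "alignment smaller than access size" rule
below is only an assumed simplification.

```c
#include <stdbool.h>

/* As in GCC's rtl.h: one instruction is four cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for SLOW_UNALIGNED_ACCESS (mode, align).
   Assumption: an access is slow whenever the known alignment is
   smaller than the access size.  The real macro is target-specific.  */
static bool
slow_unaligned_access (unsigned int mode_bits, unsigned int align_bits)
{
  return align_bits < mode_bits;
}

/* Hypothetical model of the patched MEM case: the base cost is
   unchanged, and slow unaligned accesses add a 100-insn penalty.  */
static int
mem_cost (bool speed, unsigned int mode_bits, unsigned int align_bits)
{
  int total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
  if (slow_unaligned_access (mode_bits, align_bits))
    total += COSTS_N_INSNS (100);
  return total;
}
```

Under these assumptions, a 32-bit access known to be 32-bit aligned costs
8 units when optimizing for speed, and the same access with only 8-bit
alignment costs 408, which is large enough that combine will not consider
the merged form cheaper.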

Comments

Segher Boessenkool Aug. 2, 2016, 3:31 p.m. UTC | #1
On Wed, Aug 03, 2016 at 12:05:07AM +0930, Alan Modra wrote:
> As noted in the last patch, rs6000_rtx_costs ought to cost slow
> unaligned mems.  This stops combine merging loads/stores with a
> mode-changing SET subreg, if the load/store in the subreg mode would
> be slow.  Costing slow mems at 100 insns is just an order of magnitude
> estimate.  (The alignment interrupt does cost quite a lot.
> Experiments on power8 with a misaligned lwarx showed taking the
> alignment interrupt cost roughly 300 insns.)

Okay for trunk.  Do you have a testcase, too?


Segher

Alan Modra Aug. 3, 2016, 5:07 a.m. UTC | #2
On Tue, Aug 02, 2016 at 10:31:33AM -0500, Segher Boessenkool wrote:
> On Wed, Aug 03, 2016 at 12:05:07AM +0930, Alan Modra wrote:
> > As noted in the last patch, rs6000_rtx_costs ought to cost slow
> > unaligned mems.  This stops combine merging loads/stores with a
> > mode-changing SET subreg, if the load/store in the subreg mode would
> > be slow.  Costing slow mems at 100 insns is just an order of magnitude
> > estimate.  (The alignment interrupt does cost quite a lot.
> > Experiments on power8 with a misaligned lwarx showed taking the
> > alignment interrupt cost roughly 300 insns.)
> 
> Okay for trunk.  Do you have a testcase, too?

All of yesterday's patches were from investigating
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71680#c1
but that testcase isn't particularly useful past the fix for power8
SLOW_UNALIGNED_ACCESS.
Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5b9aae2..2ae3e7e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -34336,11 +34336,16 @@  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
     case CONST:
     case HIGH:
     case SYMBOL_REF:
+      *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
+      return true;
+
     case MEM:
       /* When optimizing for size, MEM should be slightly more expensive
 	 than generating address, e.g., (plus (reg) (const)).
 	 L1 cache latency is about two instructions.  */
       *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
+      if (SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (x)))
+	*total += COSTS_N_INSNS (100);
       return true;
 
     case LABEL_REF: