Patchwork Fix PR57540, try to choose scaled_offset address mode when expanding array reference

Submitter Bin Cheng
Date June 14, 2013, 10:18 a.m.
Message ID <00ca01ce68e8$8e67ed00$ab37c700$@cheng@arm.com>
Permalink /patch/251322/
State New

Comments

Bin Cheng - June 14, 2013, 10:18 a.m.
Hi,
As reported in PR57540, GCC chooses a bad addressing mode, with two
consequences: a) the invariant part of the address expression is not kept
or hoisted; b) additional computation is emitted that should instead be
encoded in the address expression.  The reason is that when GCC runs into
"addr+offset" (which is invalid) during expansion, it pre-computes the
entire address and accesses the memory unit using "MEM[reg]".  Yet we can
force addr into a register and try to generate "reg+offset", which is
valid for targets like ARM.  By doing this, we can:
1) keep addr in loop-invariant form and hoist it later;
2) save additional computation by taking advantage of the scaled
addressing mode.
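
For illustration only, a loop of roughly the following shape references a
stack array with a variable index on every iteration, which is the kind of
array reference discussed above.  This is a hypothetical sketch, not the
test case attached to the PR; the names chain_length, links and TABLE_SIZE
are made up here, and the code a given compiler emits for it may differ.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8   /* hypothetical size, chosen only for this sketch */

/* Follow a chain of indices stored in a local (stack) array until a
   negative entry is reached, and return the number of steps taken.
   Assumes the chain is acyclic; a cycle would never terminate.  */
static unsigned char chain_length(const int *links, size_t n, int start)
{
    int table[TABLE_SIZE];

    assert(n <= TABLE_SIZE);
    /* Copy the caller's links into the stack array; unused slots end a chain. */
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = i < n ? links[i] : -1;

    unsigned char steps = 0;
    int idx = start;
    while (idx >= 0) {
        assert(idx < TABLE_SIZE);
        /* Array reference: address = &table[0] + idx * sizeof (int).
           The base is loop invariant; only the scaled index changes.  */
        idx = table[idx];
        steps++;
    }
    return steps;
}
```

Here the base of table is fixed relative to sp for the whole loop, so the
only per-iteration work the address needs is scaling idx, which a
"reg + scaled_offset" address can express directly.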

This issue has a substantial impact in ARM mode; Thumb2 also benefits,
although not as much.  For the example from the bug entry, assembly code
like:
	blt	.L3
.L5:
	add	lr, sp, #2064            ////loop invariant
	add	r2, r2, #1
	add	r3, lr, r3, asl #2
	ldr	r3, [r3, #-2064]
	cmp	r3, #0
	bge	.L5
	uxtb	r2, r2

can be optimized into:

	blt	.L3	
.L5:
	add	r2, r2, #1
	ldr	r3, [sp, r3, asl #2]
	cmp	r3, #0
	bge	.L5
	uxtb	r2, r2
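
As a sanity check on the two sequences above, the address arithmetic can
be modelled in plain C.  This is a minimal sketch under stated
assumptions: sp and index are ordinary 32-bit values standing in for the
sp and r3 registers, and the function names are invented for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Original sequence:
     add lr, sp, #2064
     add r3, lr, r3, asl #2
     ldr r3, [r3, #-2064]
   The returned value is the address the ldr reads from.  */
static uint32_t address_before(uint32_t sp, uint32_t index)
{
    uint32_t lr = sp + 2064u;           /* loop-invariant part */
    uint32_t r3 = lr + (index << 2);    /* add the scaled index */
    return r3 - 2064u;                  /* the #-2064 load offset */
}

/* Optimized sequence:
     ldr r3, [sp, r3, asl #2]  */
static uint32_t address_after(uint32_t sp, uint32_t index)
{
    return sp + (index << 2);           /* reg + scaled_offset */
}

/* Both sequences must address the same word; unsigned 32-bit
   arithmetic wraps, so this holds for any inputs.  */
static void check_same_address(uint32_t sp, uint32_t index)
{
    assert(address_before(sp, index) == address_after(sp, index));
}
```

The +2064 and -2064 cancel, so the two adds in the original loop body
contribute nothing to the final address that the scaled addressing mode
does not already provide.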

Bootstrapped and tested on x86/arm.  Any comments?

Thanks.
bin

2013-06-13  Bin Cheng  <bin.cheng@arm.com>

	PR target/57540
	* emit-rtl.c (offset_address): Try to force ADDR into register and
	generate reg+offset if addr+offset is invalid.

Patch

Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 199949)
+++ gcc/emit-rtl.c	(working copy)
@@ -2175,15 +2175,20 @@  offset_address (rtx memref, rtx offset, unsigned H
 
   /* At this point we don't know _why_ the address is invalid.  It
      could have secondary memory references, multiplies or anything.
+     Yet we can try to force addr into register, in order to catch
+     the scaled addressing opportunity as "reg + scaled_offset".
 
-     However, if we did go and rearrange things, we can wind up not
+     Otherwise, if we did go and rearrange things, we can wind up not
      being able to recognize the magic around pic_offset_table_rtx.
      This stuff is fragile, and is yet another example of why it is
-     bad to expose PIC machinery too early.  */
+     bad to expose PIC machinery too early.  We may also wind up not
+     being able to recognize the scaled addressing pattern.
+
+     It won't hurt because the address here is invalid and we are
+     going to pre-compute it anyway.  */
   if (! memory_address_addr_space_p (GET_MODE (memref), new_rtx,
 				     attrs.addrspace)
-      && GET_CODE (addr) == PLUS
-      && XEXP (addr, 0) == pic_offset_table_rtx)
+      && GET_CODE (addr) == PLUS)
     {
       addr = force_reg (GET_MODE (addr), addr);
       new_rtx = simplify_gen_binary (PLUS, address_mode, addr, offset);