Patchwork [i386] : Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)

login
register
mail settings
Submitter Uros Bizjak
Date July 25, 2011, 12:33 p.m.
Message ID <CAFULd4affMGiqhgWS5MeHyaMP2V526pHipbzeDkCNaNj8bsLoA@mail.gmail.com>
Download mbox | patch
Permalink /patch/106651/
State New
Headers show

Comments

Uros Bizjak - July 25, 2011, 12:33 p.m.
On Mon, Jul 25, 2011 at 3:58 AM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> You are not fixing the core of the problem... this is why you need so
>> much hacks and kludges at various places (some w.r.t. -fPIC already
>> existed, see the patch). Above, you correctly identified the problem,
>> so let's avoid gen_lowpart on SImode operands by not calling it
>> anymore.
>>
>> Attached patch effectively rewrites LEA handling. The trick is, that
>> instead of using Pmode operations in addresses, we use either SImode
>> or DImode operations to calculate the address on 64bit targets. Up to
>> now, address calculations strictly used Pmode, so SImode on 32bit
>> targets and DImode on 64bit targets. Recent patches to
>> ix86_decompose_address and ix86_legitimate_address_p relaxed this
>> requirement.
>>
>> Attached patch changes LEA patterns and LEA splitters to accept
>> addresses, calculated with either SImode or DImode operations.This
>> means, that on x64 targets, we don't use gen_lowpart on SImode
>> operands anymore. Since symbol references on x32 are in SImode, this
>> solves the problem. The patch also avoids generating SImode subregs of
>> DImode addresses and DImode zero_extends of SImode addresses, since
>> LEA insn does this for us automatically.
>>
>> Please also note the change to ix86_print_operand_address. To avoid
>> addr32 prefixes, we can force registers in DImode on 64bit targets
>> without any problems. On x32, we can investigate, if this change
>> avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).
>
> The testcase won't compile since PIC doesn't work:

Well, I did say that -fPIC did not work.

>> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
>> {,-m32} with no regressions. H.J., can you please test it on x32?
>
> On x32, it failed:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832
>
>> BTW: -fPIC is not yet implemented on trunk and still fails there with
>> an (unrelated) error, I didn't check x32 branch.
>>
>
> This could be:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833

Attached patch implements -fpic handling for x32. In x32 mode, we now
use x86_64_general_operand and corresponding "e" constraints for adds
in SImode, since it looks that invalid addresses can only be generated
through adds. This avoids a whole bunch of new predicates and
constraints.

2011-07-25  Uros Bizjak  <ubizjak@gmail.com>

	PR target/47381
	PR target/49832
	PR target/49833
	* config/i386/i386.md (add_operand): New mode attribute.
	(*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
	(*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
	(*lea_1): Use SWI48 mode iterator.
	(*lea_1_zext): New insn pattern.
	(add<mode>3): Use <add_operand> predicate for operand 2.
	(*add<mode>_1): Use <add_operand> predicate for operand 2.  Use "le"
	constraint for alternative 2.
	(addsi_1_zext): Use addsi_operand predicate for operand 2.  Use "le"
	constraint for alternative 2.
	(add->lea splitter): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(add->lea zext splitter): Do not extend operands to DImode.
	(*lea_general_1): Handle only QImode and HImode operands.
	(*lea_general_2): Ditto.
	(*lea_general_3): Ditto.
	(*lea_general_1_zext): Remove.
	(*lea_general_2_zext): Ditto.
	(*lea_general_3_zext): Ditto.
	(*lea_general_4): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(ashift->lea splitter): Ditto.
	* config/i386/i386.c (ix86_print_operand_address): Print address
	registers with 'q' modifier on 64bit targets.
	* config/i386/predicates.md (pic_32bit_opreand): Define as special
	predicate.  Reject non-SI and non-DI modes.
	(addsi_operand): New predicate.

Uros.
H.J. Lu - July 25, 2011, 1:30 p.m.
On Mon, Jul 25, 2011 at 5:33 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:58 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> You are not fixing the core of the problem... this is why you need so
>>> much hacks and kludges at various places (some w.r.t. -fPIC already
>>> existed, see the patch). Above, you correctly identified the problem,
>>> so let's avoid gen_lowpart on SImode operands by not calling it
>>> anymore.
>>>
>>> Attached patch effectively rewrites LEA handling. The trick is, that
>>> instead of using Pmode operations in addresses, we use either SImode
>>> or DImode operations to calculate the address on 64bit targets. Up to
>>> now, address calculations strictly used Pmode, so SImode on 32bit
>>> targets and DImode on 64bit targets. Recent patches to
>>> ix86_decompose_address and ix86_legitimate_address_p relaxed this
>>> requirement.
>>>
>>> Attached patch changes LEA patterns and LEA splitters to accept
>>> addresses, calculated with either SImode or DImode operations.This
>>> means, that on x64 targets, we don't use gen_lowpart on SImode
>>> operands anymore. Since symbol references on x32 are in SImode, this
>>> solves the problem. The patch also avoids generating SImode subregs of
>>> DImode addresses and DImode zero_extends of SImode addresses, since
>>> LEA insn does this for us automatically.
>>>
>>> Please also note the change to ix86_print_operand_address. To avoid
>>> addr32 prefixes, we can force registers in DImode on 64bit targets
>>> without any problems. On x32, we can investigate, if this change
>>> avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).
>>
>> The testcase won't compile since PIC doesn't work:
>
> Well, I did say that -fPIC did not work.
>
>>> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
>>> {,-m32} with no regressions. H.J., can you please test it on x32?
>>
>> On x32, it failed:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832
>>
>>> BTW: -fPIC is not yet implemented on trunk and still fails there with
>>> an (unrelated) error, I didn't check x32 branch.
>>>
>>
>> This could be:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833
>
> Attached patch implements -fpic handling for x32. In x32 mode, we now
> use x86_64_general_operand and corresponding "e" constraints for adds
> in SImode, since it looks that invalid addresses can only be generated
> through adds. This avoids a whole bunch of new predicates and
> constraints.
>
> 2011-07-25  Uros Bizjak  <ubizjak@gmail.com>
>
>        PR target/47381
>        PR target/49832
>        PR target/49833
>        * config/i386/i386.md (add_operand): New mode attribute.
>        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
>        (*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
>        (*lea_1): Use SWI48 mode iterator.
>        (*lea_1_zext): New insn pattern.
>        (add<mode>3): Use <add_operand> predicate for operand 2.
>        (*add<mode>_1): Use <add_operand> predicate for operand 2.  Use "le"
>        constraint for alternative 2.
>        (addsi_1_zext): Use addsi_operand predicate for operand 2.  Use "le"
>        constraint for alternative 2.
>        (add->lea splitter): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (add->lea zext splitter): Do not extend operands to DImode.
>        (*lea_general_1): Handle only QImode and HImode operands.
>        (*lea_general_2): Ditto.
>        (*lea_general_3): Ditto.
>        (*lea_general_1_zext): Remove.
>        (*lea_general_2_zext): Ditto.
>        (*lea_general_3_zext): Ditto.
>        (*lea_general_4): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (ashift->lea splitter): Ditto.
>        * config/i386/i386.c (ix86_print_operand_address): Print address
>        registers with 'q' modifier on 64bit targets.
>        * config/i386/predicates.md (pic_32bit_opreand): Define as special
>        predicate.  Reject non-SI and non-DI modes.
>        (addsi_operand): New predicate.
>
> Uros.
>

X32 glibc is miscompiled:

CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
 -E -x c-header'
/export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
--library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
../scripts -h rpcsvc/yppasswd.x -o
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
Segmentation fault (core dumped)

Some LEA patterns are wrong for x32.  I will investigate.
Uros Bizjak - July 25, 2011, 1:43 p.m.
On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>> use x86_64_general_operand and corresponding "e" constraints for adds
>> in SImode, since it looks that invalid addresses can only be generated
>> through adds. This avoids a whole bunch of new predicates and
>> constraints.

> X32 glibc is miscompiled:
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
> Segmentation fault (core dumped)
>
> Some LEA patterns are wrong for x32.  I will investigate.

What about x32 GCC testsuite?

Uros.
H.J. Lu - July 25, 2011, 2:09 p.m.
On Mon, Jul 25, 2011 at 6:43 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>>> use x86_64_general_operand and corresponding "e" constraints for adds
>>> in SImode, since it looks that invalid addresses can only be generated
>>> through adds. This avoids a whole bunch of new predicates and
>>> constraints.
>
>> X32 glibc is miscompiled:
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
>> Segmentation fault (core dumped)
>>
>> Some LEA patterns are wrong for x32.  I will investigate.
>
> What about x32 GCC testsuite?
>

GCC testsuite is clean on x32.

Patch

Index: i386.md
===================================================================
--- i386.md	(revision 176733)
+++ i386.md	(working copy)
@@ -901,6 +901,14 @@ 
 	 (SI "nonmemory_operand")
 	 (DI "x86_64_nonmemory_operand")])
 
+;; Operand predicate for adds.
+(define_mode_attr add_operand
+	[(QI "general_operand")
+	 (HI "general_operand")
+	 (SI "addsi_operand")
+	 (DI "x86_64_general_operand")
+	 (TI "x86_64_general_operand")])
+
 ;; Operand predicate for shifts.
 (define_mode_attr shift_operand
 	[(QI "nonimmediate_operand")
@@ -2039,7 +2047,7 @@ 
 	      (const_string "ssemov")
 	    (eq_attr "alternative" "16,17")
 	      (const_string "ssecvt")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -2184,7 +2192,7 @@ 
   [(set (match_operand:SI 0 "nonimmediate_operand"
 			"=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
 	(match_operand:SI 1 "general_operand"
-			"g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
+			"g ,re,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
@@ -2232,7 +2240,7 @@ 
 	      (const_string "sselog1")
 	    (eq_attr "alternative" "7,8,9,10,11")
 	      (const_string "ssemov")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -5371,7 +5379,7 @@ 
 (define_expand "add<mode>3"
   [(set (match_operand:SDWIM 0 "nonimmediate_operand" "")
 	(plus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand" "")
-		    (match_operand:SDWIM 2 "<general_operand>" "")))]
+		    (match_operand:SDWIM 2 "<add_operand>" "")))]
   ""
   "ix86_expand_binary_operator (PLUS, <MODE>mode, operands); DONE;")
 
@@ -5425,13 +5433,22 @@ 
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-	(match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5453,7 +5470,7 @@ 
   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,rm,r,r")
 	(plus:SWI48
 	  (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r")
-	  (match_operand:SWI48 2 "<general_operand>" "<g>,r<i>,0,l<i>")))
+	  (match_operand:SWI48 2 "<add_operand>" "<g>,r<i>,0,le")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
 {
@@ -5512,7 +5529,7 @@ 
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r")
-		   (match_operand:SI 2 "general_operand" "g,0,li"))))
+		   (match_operand:SI 2 "addsi_operand" "g,0,le"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
 {
@@ -5794,39 +5811,36 @@ 
         (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+      operands[2] = gen_lowpart (mode, operands[2]);
+    }
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_PLUS (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 ;; ??? This pattern handles immediate operands that do not satisfy immediate
 ;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
@@ -5839,7 +5853,7 @@ 
   [(set (match_dup 0)
 	(plus:DI (match_dup 1) (match_dup 2)))])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -5849,11 +5863,7 @@ 
   "TARGET_64BIT && reload_completed
    && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (match_dup 1) (match_dup 2)) 0)))]
-{
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_lowpart (DImode, operands[2]);
-})
+	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*add<mode>_2"
   [(set (reg FLAGS_REG)
@@ -6233,7 +6243,7 @@ 
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
-;; The lea patterns for non-Pmodes needs to be matched by
+;; The lea patterns for modes less than 32 bits need to be matched by
 ;; several insns converted to real lea by splitters.
 
 (define_insn_and_split "*lea_general_1"
@@ -6241,8 +6251,7 @@ 
 	(plus (plus (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "register_operand" "r"))
 	      (match_operand 3 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[2])
@@ -6252,53 +6261,30 @@ 
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_PLUS (Pmode, operands[1], operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_PLUS (mode, operands[1], operands[2]),
   		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "register_operand" "r"))
-		   (match_operand:SI 3 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_2"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (mult (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "const248_operand" "i"))
 	      (match_operand 3 "nonmemory_operand" "ri")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && (GET_MODE (operands[0]) == GET_MODE (operands[3])
        || GET_MODE (operands[3]) == VOIDmode)"
@@ -6306,110 +6292,68 @@ 
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1], operands[2]),
-  		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, operands[1], operands[2]),
+		      operands[3]);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_2_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (mult:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "const248_operand" "n"))
-		   (match_operand:SI 3 "nonmemory_operand" "ri"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (mult:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (plus (mult (match_operand 1 "index_register_operand" "l")
 			  (match_operand 2 "const248_operand" "i"))
 		    (match_operand 3 "register_operand" "r"))
 	      (match_operand 4 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[3])"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-  pat = gen_rtx_PLUS (Pmode,
-  		      gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1],
-		      					 operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+
+  pat = gen_rtx_PLUS (mode,
+  		      gen_rtx_PLUS (mode,
+				    gen_rtx_MULT (mode, operands[1],
+		      					operands[2]),
 				    operands[3]),
   		      operands[4]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_3_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (mult:SI
-		       (match_operand:SI 1 "index_register_operand" "l")
-		       (match_operand:SI 2 "const248_operand" "n"))
-		     (match_operand:SI 3 "register_operand" "r"))
-		   (match_operand:SI 4 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (mult:DI (match_dup 1)
-							      (match_dup 2))
-						     (match_dup 3))
-					    (match_dup 4)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_4"
-  [(set (match_operand:SWI 0 "register_operand" "=r")
-	(any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
-				(match_operand:SWI 2 "const_int_operand" "n"))
-		    (match_operand 3 "const_int_operand" "n")))]
-  "(<MODE>mode == DImode
-    || <MODE>mode == SImode
-    || !TARGET_PARTIAL_REG_STALL
-    || optimize_function_for_size_p (cfun))
+  [(set (match_operand 0 "register_operand" "=r")
+	(any_or (ashift
+		  (match_operand 1 "index_register_operand" "l")
+		  (match_operand 2 "const_int_operand" "n"))
+		(match_operand 3 "const_int_operand" "n")))]
+  "(((GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+      && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)))
+    || GET_MODE (operands[0]) == SImode
+    || (TARGET_64BIT && GET_MODE (operands[0]) == DImode))
+   && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
        < ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
@@ -6417,23 +6361,29 @@ 
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = GET_MODE (operands[0]);
   rtx pat;
-  if (<MODE>mode != DImode)
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
+
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));
-  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+
+  pat = plus_constant (gen_rtx_MULT (mode, operands[1], operands[2]),
 		       INTVAL (operands[3]));
-  if (Pmode != SImode && <MODE>mode != DImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set (attr "mode")
-      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
-	(const_string "SI")
-	(const_string "DI")))])
+      (if_then_else (match_operand:DI 0 "" "")
+	(const_string "DI")
+	(const_string "SI")))])
 
 ;; Subtract instructions
 
@@ -9395,36 +9345,36 @@ 
        (const_string "*")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(ashift (match_operand 1 "index_register_operand" "")
                 (match_operand:QI 2 "const_int_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  if (mode != Pmode)
-    operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), Pmode);
-
-  pat = gen_rtx_MULT (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), mode);
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_MULT (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
Index: predicates.md
===================================================================
--- predicates.md	(revision 176733)
+++ predicates.md	(working copy)
@@ -366,9 +366,13 @@ 
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_predicate "pic_32bit_operand"
+(define_special_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
+  if (GET_MODE (op) != SImode
+      && GET_MODE (op) != DImode)
+    return false;
+
   if (!flag_pic)
     return false;
   /* Rule out relocations that translate into 64bit constants.  */
@@ -1188,3 +1192,9 @@ 
       return false;
   return true;
 })
+
+;; Test for a valid add operand for SImode.
+(define_predicate "addsi_operand"
+  (if_then_else (match_test "TARGET_X32")
+    (match_operand 0 "x86_64_general_operand")
+    (match_operand 0 "general_operand")))
Index: i386.c
===================================================================
--- i386.c	(revision 176733)
+++ i386.c	(working copy)
@@ -14122,6 +14122,9 @@  ix86_print_operand_address (FILE *file, 
     }
   else
     {
+      /* Use DImode registers in address on 64bit target.  */
+      int code = TARGET_64BIT ? 'q' : 0;
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{
 	  if (disp)
@@ -14136,11 +14139,11 @@  ix86_print_operand_address (FILE *file, 
 
 	  putc ('(', file);
 	  if (base)
-	    print_reg (base, 0, file);
+	    print_reg (base, code, file);
 	  if (index)
 	    {
 	      putc (',', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, ",%d", scale);
 	    }
@@ -14175,7 +14178,7 @@  ix86_print_operand_address (FILE *file, 
 	  putc ('[', file);
 	  if (base)
 	    {
-	      print_reg (base, 0, file);
+	      print_reg (base, code, file);
 	      if (offset)
 		{
 		  if (INTVAL (offset) >= 0)
@@ -14191,7 +14194,7 @@  ix86_print_operand_address (FILE *file, 
 	  if (index)
 	    {
 	      putc ('+', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, "*%d", scale);
 	    }