Patchwork Patch: Print SImode register names to force addr32 prefix

login
register
mail settings
Submitter H.J. Lu
Date Nov. 9, 2012, 9:17 a.m.
Message ID <20121109091727.GA23790@gmail.com>
Download mbox | patch
Permalink /patch/197997/
State New
Headers show

Comments

H.J. Lu - Nov. 9, 2012, 9:17 a.m.
Hi,

Since x32 runs in 64-bit mode, for address -0x40000300(%rax), hardware
sign-extends displacement from 32-bits to 64-bits and adds it to %rax.
But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
uses 32-bit registers instead of 64-bit registers when displacement 
< -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
still generate -16(%rsp) instead of -16(%esp).

Tested it on Linux/x32.  OK to install?

Thanks.


H.J.
---
2012-11-09  H.J. Lu  <hongjiu.lu@intel.com>

	* config/i386/i386.c (ix86_print_operand_address): For x32,
	print SImode register names to force addr32 prefix if displacement
	< -16*1024*1024.
Uros Bizjak - Nov. 9, 2012, 5:23 p.m.
On Fri, Nov 9, 2012 at 10:17 AM, H.J. Lu <hjl.tools@gmail.com> wrote:

> Since x32 runs in 64-bit mode, for address -0x40000300(%rax), hardware
> sign-extends displacement from 32-bits to 64-bits and adds it to %rax.
> But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
> uses 32-bit registers instead of 64-bit registers when displacement
> < -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
> still generate -16(%rsp) instead of -16(%esp).
>
> Tested it on Linux/x32.  OK to install?

This problem uncovers a bug in the middle-end, so I guess it would be
better to fix it there.

Uros.
H.J. Lu - Nov. 11, 2012, 9:37 p.m.
On Fri, Nov 9, 2012 at 9:23 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Nov 9, 2012 at 10:17 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>> Since x32 runs in 64-bit mode, for address -0x40000300(%rax), hardware
>> sign-extends displacement from 32-bits to 64-bits and adds it to %rax.
>> But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
>> uses 32-bit registers instead of 64-bit registers when displacement
>> < -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
>> still generate -16(%rsp) instead of -16(%esp).
>>
>> Tested it on Linux/x32.  OK to install?
>
> This problem uncovers a bug in the middle-end, so I guess it would be
> better to fix it there.
>
> Uros.

The problem is it isn't safe to transform

(zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))

to

(plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))

when Y is negative and its absolute value is greater than FOO.  However,
we have to do this transformation since other parts of GCC depend on
it.  If we revert the fix for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

we will get

FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
ps -finline-functions  (internal compiler error)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
ps -finline-functions  (test for excess errors)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
(internal compiler error)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
(test for excess errors)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (internal compi
ler error)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (test for exces
s errors)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (test for excess errors)
FAIL: gcc.dg/Warray-bounds.c (internal compiler error)
FAIL: gcc.dg/Warray-bounds.c (test for excess errors)

since we generate pseudo registers to convert SImode to DImode
after reload.  Fixing it requires significant changes.

This is only a problem for 64-bit register address since the symbolic
address is 32-bit.  Using 32-bit base/index registers will work around
this issue.
H.J. Lu - Nov. 13, 2012, 7:15 a.m.
On Sun, Nov 11, 2012 at 1:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Nov 9, 2012 at 9:23 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Fri, Nov 9, 2012 at 10:17 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>> Since x32 runs in 64-bit mode, for address -0x40000300(%rax), hardware
>>> sign-extends displacement from 32-bits to 64-bits and adds it to %rax.
>>> But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
>>> uses 32-bit registers instead of 64-bit registers when displacement
>>> < -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
>>> still generate -16(%rsp) instead of -16(%esp).
>>>
>>> Tested it on Linux/x32.  OK to install?
>>
>> This problem uncovers a bug in the middle-end, so I guess it would be
>> better to fix it there.
>>
>> Uros.
>
> The problem is it isn't safe to transform
>
> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>
> to
>
> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>
> when Y is negative and its absolute value is greater than FOO.  However,
> we have to do this transformation since other parts of GCC depend on
> it.  If we revert the fix for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>
> we will get
>
> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (internal compiler error)
> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (test for excess errors)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
> ps -finline-functions  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
> ps -finline-functions  (test for excess errors)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
> (internal compiler error)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
> (test for excess errors)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (internal compi
> ler error)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (test for exces
> s errors)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (test for excess errors)
> FAIL: gcc.dg/Warray-bounds.c (internal compiler error)
> FAIL: gcc.dg/Warray-bounds.c (test for excess errors)
>
> since we generate pseudo registers to convert SImode to DImode
> after reload.  Fixing it requires significant changes.
>
> This is only a problem for 64-bit register address since the symbolic
> address is 32-bit.  Using 32-bit base/index registers will work around
> this issue.

This address

(plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))

is OK for x32 as long as Y, which is encoded as 32-bit immediate,
is zero-extend from 32-bit to 64-bit.  SImode address does it.
My patch optimizes it a little bit by using SImode address only
for Y < -16*1024*1024.
Uros Bizjak - Nov. 13, 2012, 7:28 a.m.
On Tue, Nov 13, 2012 at 8:15 AM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>> Since x32 runs in 64-bit mode, for address -0x40000300(%rax), hardware
>>>> sign-extends displacement from 32-bits to 64-bits and adds it to %rax.
>>>> But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
>>>> uses 32-bit registers instead of 64-bit registers when displacement
>>>> < -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
>>>> still generate -16(%rsp) instead of -16(%esp).
>>>>
>>>> Tested it on Linux/x32.  OK to install?
>>>
>>> This problem uncovers a bug in the middle-end, so I guess it would be
>>> better to fix it there.
>>>
>>> Uros.
>>
>> The problem is it isn't safe to transform
>>
>> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>>
>> to
>>
>> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>>
>> when Y is negative and its absolute value is greater than FOO.  However,
>> we have to do this transformation since other parts of GCC depend on
>> it.  If we revert the fix for
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>
>> we will get
>>
>> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (internal compiler error)
>> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
>> ps -finline-functions  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-all-loo
>> ps -finline-functions  (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
>> (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer -funroll-loops
>> (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (internal compi
>> ler error)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (test for exces
>> s errors)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (test for excess errors)
>> FAIL: gcc.dg/Warray-bounds.c (internal compiler error)
>> FAIL: gcc.dg/Warray-bounds.c (test for excess errors)
>>
>> since we generate pseudo registers to convert SImode to DImode
>> after reload.  Fixing it requires significant changes.
>>
>> This is only a problem for 64-bit register address since the symbolic
>> address is 32-bit.  Using 32-bit base/index registers will work around
>> this issue.
>
> This address
>
> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>
> is OK for x32 as long as Y, which is encoded as 32-bit immediate,
> is zero-extend from 32-bit to 64-bit.  SImode address does it.
> My patch optimizes it a little bit by using SImode address only
> for Y < -16*1024*1024.

I was wondering, why we operate with constant -16*1024*1024? Should we
use 0x7FFFFFF instead? Since the MSB is always zero, we are safe.

Please add a fat ??? comment, why we paper-over this issue and repost
the latest patch. I got lost in all the versions :(

Uros.

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 78e840b..6d4fbb5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14502,6 +14514,17 @@  ix86_print_operand_address (FILE *file, rtx addr)
 	  gcc_assert (!code);
 	  code = 'l';
 	}
+      else if (code == 0
+	       && TARGET_X32
+	       && disp
+	       && CONST_INT_P (disp)
+	       && INTVAL (disp) < -16*1024*1024)
+	{
+	  /* For x32, print SImode register names to force addr32 prefix
+	    if displacement < -16*1024*1024 so that 32-bit displacement
+	    isn't sign-extended to 64-bit.  */
+	  code = 'k';
+	}
 
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{