Patchwork [RFC,i386] : Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64

login
register
mail settings
Submitter Uros Bizjak
Date Dec. 30, 2012, 7:08 p.m.
Message ID <CAFULd4aC4Q+8d4GfUZRLe83L6RR=aq=t_4xYswV7P8izUF-xzw@mail.gmail.com>
Download mbox | patch
Permalink /patch/208757/
State New
Headers show

Comments

Uros Bizjak - Dec. 30, 2012, 7:08 p.m.
On Fri, Dec 28, 2012 at 9:27 PM, Richard Henderson <rth@redhat.com> wrote:
> On 12/27/2012 12:08 AM, Uros Bizjak wrote:
>> The alternative approach is changing cpuid definition in cpuid.h (as
>> in attached patch) to preserve %rbx, but we can't detect various code
>> model settings there. Since the change depends on the definition of
>> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
>
> Certainly we can.  We also control the preprocessor defines.
> All that's needed is that we create one for the code model.

Something like attached?

I have also included all suggestions (earlyclobber and operand prefix
on temporary register).

2012-12-30  Uros Bizjak  <ubizjak@gmail.com>

	PR target/55712
	* config/i386/i386-c.c (ix86_target_macros_internal): Depending on
	selected code model, define __code_mode_small__, __code_model_medium__,
	__code_model_large__, __code_model_32__ or __code_model_kernel__.
	* config/i386/cpuid.h (__cpuid, __cpuid_count) [__i386__]: Prefix
	xchg temporary register with %k.  Declare temporary register as
	early clobbered.
	[__x86_64__]: For medium and large code models, preserve %rbx register.

Tested on x86_64-pc-linux-gnu {,-m32}.

Uros.
Leif Ekblad - Jan. 4, 2013, 9:39 p.m.
I just tried the patch, but it seems to produce some problems for me. The 
other patch which used a 64-bit specific register (r15) instead of rbx was 
easier to adapt to. The problem for me is that syscalls might clobber 
higher-half of all 32-bit registers, and that I cannot use the stack to 
preserve rbx in the call because of the red-zone.

Problem code:

#define RdosUserGateEdiEcxPar0RetEbx(nr, rdi, rcx, size, res) do { \
  register int _id asm("r14") = nr; \
  register typeof(rdi) _rdi asm("rdi") = (rdi); \
  register typeof(rcx) _rcx asm("r8") = (rcx); \
  register int _size asm("r12") = (size); \
  asm volatile ( \
    "syscall\n\t" \
    "jc 1f\n\t" \
    "movzx %%bx,%%rax\n\t" \
    "jmp 2f\n\t" \
    "1: \n\t" \
    "xorq %%rax,%%rax\n\t" \
    "2: \n\t" \
    : "=a" (res) : "r" (_id), "r" (_rdi), "r" (_rcx), "r" (_size) : "rbx", 
"rdx", "rsi" \
  ); \
} while(0);

inline volatile int RdosCreateFile(const char *FileName, int Attrib)
{
    int res;
    int size = strlen(FileName) + 1;
    RdosUserGateEdiEcxPar0RetEbx(usergate_create_file, FileName, Attrib, 
size, res);
    return res;
}

Error log:

$ rdos-gcc test.c -o test.exe
In file included from /usr/local/rdos/include/rdos.h:719:0,
from test.c:1:
/usr/local/rdos/include/rdosgcc.h: In function 'main':
/usr/local/rdos/include/rdosgcc.h:236:5: error: PIC register clobbered by 
'rbx' in 'asm'
RdosUserGateEdiEcxPar0RetEbx(usergate_open_file, FileName, Access, size, 
res);
^

I would prefer to change PIC register to r15 instead, or alternatively make 
this an target-option.

Regards,
Leif Ekblad


----- Original Message ----- 
From: "Uros Bizjak" <ubizjak@gmail.com>
To: "Richard Henderson" <rth@redhat.com>
Cc: "Andi Kleen" <andi@firstfloor.org>; "Mike Frysinger" 
<vapier@gentoo.org>; <gcc-patches@gcc.gnu.org>; "Leif Ekblad" 
<leif@rdos.net>; "Jakub Jelinek" <jakub@redhat.com>; <jh@suse.cz>; "H.J. Lu" 
<hjl.tools@gmail.com>
Sent: Sunday, December 30, 2012 8:08 PM
Subject: Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on 
x86_64


> On Fri, Dec 28, 2012 at 9:27 PM, Richard Henderson <rth@redhat.com> wrote:
>> On 12/27/2012 12:08 AM, Uros Bizjak wrote:
>>> The alternative approach is changing cpuid definition in cpuid.h (as
>>> in attached patch) to preserve %rbx, but we can't detect various code
>>> model settings there. Since the change depends on the definition of
>>> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
>>
>> Certainly we can.  We also control the preprocessor defines.
>> All that's needed is that we create one for the code model.
>
> Something like attached?
>
> I have also included all suggestions (earlyclobber and operand prefix
> on temporary register).
>
> 2012-12-30  Uros Bizjak  <ubizjak@gmail.com>
>
> PR target/55712
> * config/i386/i386-c.c (ix86_target_macros_internal): Depending on
> selected code model, define __code_mode_small__, __code_model_medium__,
> __code_model_large__, __code_model_32__ or __code_model_kernel__.
> * config/i386/cpuid.h (__cpuid, __cpuid_count) [__i386__]: Prefix
> xchg temporary register with %k.  Declare temporary register as
> early clobbered.
> [__x86_64__]: For medium and large code models, preserve %rbx register.
>
> Tested on x86_64-pc-linux-gnu {,-m32}.
>
> Uros.
>
Jakub Jelinek - Jan. 4, 2013, 9:42 p.m.
On Fri, Jan 04, 2013 at 10:39:05PM +0100, Leif Ekblad wrote:
> I just tried the patch, but it seems to produce some problems for
> me. The other patch which used a 64-bit specific register (r15)
> instead of rbx was easier to adapt to. The problem for me is that
> syscalls might clobber higher-half of all 32-bit registers, and that
> I cannot use the stack to preserve rbx in the call because of the
> red-zone.

Of course you can use the stack, just subtract the red zone size (plus
whatever you need) from %rsp first, then save to stack, do syscall,
then restore from stack and finally increment %rsp back.

	Jakub
Leif Ekblad - Jan. 4, 2013, 10:31 p.m.
OK. I know I can do that, but I would need to do it for every syscall since 
every syscall can potentially clobber rbx.

Also, it is very strange that it is only one of the inlines the compiler 
complains about. I have another inline (which uses rbx as input), but that 
doesn't generate any error:

#define RdosUserGateEbxEdiEcxParRetEax(nr, rbx, rdi, rcx, res) do { \
  register int _id asm("r14") = nr; \
  register typeof(rbx) _rbx asm("rbx") = (rbx); \
  register typeof(rdi) _rdi asm("rdi") = (rdi); \
  register typeof(rcx) _rcx asm("r8") = (rcx); \
  register int _size asm("r12") = (rcx); \
  asm volatile ( \
    "syscall\n\t" \
    "jnc 1f\n\t" \
    "xorq %%rax,%%rax\n\t" \
    "1: \n\t" \
    : "=a" (res) :  "r" (_id), "r" (_rbx), "r" (_rdi), "r" (_rcx), "r" 
(_size) : "rdx", "rsi" \
  ); \
} while(0);


inline int RdosWriteFile(int Handle, void *Buf, int Size)
{
    int res;
    RdosUserGateEbxEdiEcxParRetEax(usergate_read_file, Handle, Buf, Size, 
res);
    return res;
}

Why is not this inline causing the problem? Might that be because it will 
not use rbx, but instead another register? OTOH, the code seems to work and 
rbx is not assigned to PIC at the point of the call, but to the defined 
parameter.

Regards,
Leif Ekblad


----- Original Message ----- 
From: "Jakub Jelinek" <jakub@redhat.com>
To: "Leif Ekblad" <leif@rdos.net>
Cc: "Uros Bizjak" <ubizjak@gmail.com>; "Richard Henderson" <rth@redhat.com>; 
"Andi Kleen" <andi@firstfloor.org>; "Mike Frysinger" <vapier@gentoo.org>; 
<gcc-patches@gcc.gnu.org>; <jh@suse.cz>; "H.J. Lu" <hjl.tools@gmail.com>
Sent: Friday, January 04, 2013 10:42 PM
Subject: Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on 
x86_64


> On Fri, Jan 04, 2013 at 10:39:05PM +0100, Leif Ekblad wrote:
>> I just tried the patch, but it seems to produce some problems for
>> me. The other patch which used a 64-bit specific register (r15)
>> instead of rbx was easier to adapt to. The problem for me is that
>> syscalls might clobber higher-half of all 32-bit registers, and that
>> I cannot use the stack to preserve rbx in the call because of the
>> red-zone.
>
> Of course you can use the stack, just subtract the red zone size (plus
> whatever you need) from %rsp first, then save to stack, do syscall,
> then restore from stack and finally increment %rsp back.
>
> Jakub

Patch

Index: config/i386/cpuid.h
===================================================================
--- config/i386/cpuid.h	(revision 194757)
+++ config/i386/cpuid.h	(working copy)
@@ -136,35 +136,50 @@ 
 /* %ebx may be the PIC register.  */
 #if __GNUC__ >= 3
 #define __cpuid(level, a, b, c, d)			\
-  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+  __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t"			\
 	   "cpuid\n\t"					\
-	   "xchg{l}\t{%%}ebx, %1\n\t"			\
-	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   "xchg{l}\t{%%}ebx, %k1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
 	   : "0" (level))
 
 #define __cpuid_count(level, count, a, b, c, d)		\
-  __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
+  __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t"			\
 	   "cpuid\n\t"					\
-	   "xchg{l}\t{%%}ebx, %1\n\t"			\
-	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   "xchg{l}\t{%%}ebx, %k1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
 	   : "0" (level), "2" (count))
 #else
 /* Host GCCs older than 3.0 weren't supporting Intel asm syntax
    nor alternatives in i386 code.  */
 #define __cpuid(level, a, b, c, d)			\
-  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+  __asm__ ("xchgl\t%%ebx, %k1\n\t"			\
 	   "cpuid\n\t"					\
-	   "xchgl\t%%ebx, %1\n\t"			\
-	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   "xchgl\t%%ebx, %k1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
 	   : "0" (level))
 
 #define __cpuid_count(level, count, a, b, c, d)		\
-  __asm__ ("xchgl\t%%ebx, %1\n\t"			\
+  __asm__ ("xchgl\t%%ebx, %k1\n\t"			\
 	   "cpuid\n\t"					\
-	   "xchgl\t%%ebx, %1\n\t"			\
-	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   "xchgl\t%%ebx, %k1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
 	   : "0" (level), "2" (count))
 #endif
+#elif defined(__x86_64__) && (defined(__code_model_medium__) || defined(__code_model_large__)) && defined(__PIC__)
+/* %ebx may be the PIC register.  */
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
+
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   : "=a" (a), "=&r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
 #else
 #define __cpuid(level, a, b, c, d)			\
   __asm__ ("cpuid\n\t"					\
Index: config/i386/i386-c.c
===================================================================
--- config/i386/i386-c.c	(revision 194757)
+++ config/i386/i386-c.c	(working copy)
@@ -243,6 +243,30 @@  ix86_target_macros_internal (HOST_WIDE_INT isa_fla
       break;
     }
 
+  switch (ix86_cmodel)
+    {
+    case CM_SMALL:
+    case CM_SMALL_PIC:
+      def_or_undef (parse_in, "__code_model_small__");
+      break;
+    case CM_MEDIUM:
+    case CM_MEDIUM_PIC:
+      def_or_undef (parse_in, "__code_model_medium__");
+      break;
+    case CM_LARGE:
+    case CM_LARGE_PIC:
+      def_or_undef (parse_in, "__code_model_large__");
+      break;
+    case CM_32:
+      def_or_undef (parse_in, "__code_model_32__");
+      break;
+    case CM_KERNEL:
+      def_or_undef (parse_in, "__code_model_kernel__");
+      break;
+    default:
+      ;
+    }
+
   if (isa_flag & OPTION_MASK_ISA_MMX)
     def_or_undef (parse_in, "__MMX__");
   if (isa_flag & OPTION_MASK_ISA_3DNOW)