Patchwork [RFC,i386] : Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64

login
register
mail settings
Submitter Uros Bizjak
Date Dec. 27, 2012, 8:09 a.m.
Message ID <CAFULd4ZJr6ftPwCqhW_7daH-OZWOc5iytCfEFpYUObTj+be34A@mail.gmail.com>
Download mbox | patch
Permalink /patch/208280/
State New
Headers show

Comments

Uros Bizjak - Dec. 27, 2012, 8:09 a.m.
On Thu, Dec 27, 2012 at 9:08 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Dec 26, 2012 at 9:16 PM, Andi Kleen <andi@firstfloor.org> wrote:
>>> Can you please post a real-world example, where using %r15 would break
>>> existing code?
>>
>> I used to run into problems like this when porting code to gcc from icc or VC.
>> A lot of hyper optimized inline assembler snippets wants to use all registers
>> and icc/VC support that. With gcc usually had to add some manual
>> push/pops. In older gcc versions usually more than one because reload
>> tended to error out otherwise.
>>
>> So by try and error used the non fixed registers, but let the compiler know
>> about the others. This case would break.
>>
>> In 64bit it was less a problem than on 32bit, but could still happen.
>>
>> Admittedly medium is somewhat obscure and rarely used (and very slow),
>> but someone could be still using it.
>
> The alternative approach is changing cpuid definition in cpuid.h (as
> in attached patch) to preserve %rbx, but we can't detect various code
> model settings there. Since the change depends on the definition of
> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
> However, code that involves cpuid is rarely performance critical, so
> perhaps we can live with this tradeoff.
>
> IMO, this patch can be used on 4.7 branch, too.

Now with the patch ...

Uros.
Florian Weimer - Dec. 27, 2012, 9:10 a.m.
* Uros Bizjak:

> +#elif defined(__x86_64__)
> +#define __cpuid(level, a, b, c, d)			\
> +  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
> +	   "cpuid\n\t"					\
> +	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
> +	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
> +	   : "0" (level))
> +
> +#define __cpuid_count(level, count, a, b, c, d)		\
> +  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
> +	   "cpuid\n\t"					\
> +	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
> +	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
> +	   : "0" (level), "2" (count))
> +#endif

Shouldn't the constraint for b be "=&r"?
Uros Bizjak - Dec. 27, 2012, 9:17 a.m.
On Thu, Dec 27, 2012 at 10:10 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Uros Bizjak:
>
>> +#elif defined(__x86_64__)
>> +#define __cpuid(level, a, b, c, d)                   \
>> +  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"                       \
>> +        "cpuid\n\t"                                  \
>> +        "xchg{q}\t{%%}rbx, %q1\n\t"                  \
>> +        : "=a" (a), "=r" (b), "=c" (c), "=d" (d)     \
>> +        : "0" (level))
>> +
>> +#define __cpuid_count(level, count, a, b, c, d)              \
>> +  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"                       \
>> +        "cpuid\n\t"                                  \
>> +        "xchg{q}\t{%%}rbx, %q1\n\t"                  \
>> +        : "=a" (a), "=r" (b), "=c" (c), "=d" (d)     \
>> +        : "0" (level), "2" (count))
>> +#endif
>
> Shouldn't the constraint for b be "=&r"?

Technically yes, but all input operands are matched to outputs, so in
practice it doesn't really matter.

Uros.

Patch

Index: i386/cpuid.h
===================================================================
--- i386/cpuid.h	(revision 194723)
+++ i386/cpuid.h	(working copy)
@@ -132,8 +132,9 @@ 
 #define signature_VORTEX_ecx	0x436f5320
 #define signature_VORTEX_edx	0x36387865
 
-#if defined(__i386__) && defined(__PIC__)
+#if defined(__PIC__)
 /* %ebx may be the PIC register.  */
+#if defined(__i386__)
 #if __GNUC__ >= 3
 #define __cpuid(level, a, b, c, d)			\
   __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"			\
@@ -165,6 +166,21 @@ 
 	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
 	   : "0" (level), "2" (count))
 #endif
+#elif defined(__x86_64__)
+#define __cpuid(level, a, b, c, d)			\
+  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level))
+
+#define __cpuid_count(level, count, a, b, c, d)		\
+  __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   "cpuid\n\t"					\
+	   "xchg{q}\t{%%}rbx, %q1\n\t"			\
+	   : "=a" (a), "=r" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "2" (count))
+#endif
 #else
 #define __cpuid(level, a, b, c, d)			\
   __asm__ ("cpuid\n\t"					\