Patchwork ARM: Clear icache when creating a closure

Submitter Andrew Haley
Date July 11, 2011, 4:23 p.m.
Message ID <4E1B2384.5080001@redhat.com>
Permalink /patch/104245/
State New

Comments

Andrew Haley - July 11, 2011, 4:23 p.m.
On a multicore ARM, you really do have to clear both caches, not just the
dcache.  This bug may exist in other ports too.

Andrew.


2011-07-11  Andrew Haley  <aph@redhat.com>

        * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
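
A minimal sketch of the idea in the patch, for readers following along: when
the trampoline is written through one mapping and executed through another,
both address ranges are passed to the cache-clearing builtin.  The helper name
publish_code and its parameters are hypothetical and not part of libffi; the
only real API used is GCC's __builtin___clear_cache, whose range is taken here
as [begin, end).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libffi.
   Copies LEN bytes of code into the writable mapping, then clears the
   cache for the writable range and, if it differs, for the executable
   range that aliases the same memory. */
static void publish_code(void *writable, void *executable,
                         const void *code, size_t len)
{
    memcpy(writable, code, len);

    /* Range covering the bytes just written through the data mapping. */
    __builtin___clear_cache((char *)writable, (char *)writable + len);

    /* Range covering the same bytes as seen through the code mapping. */
    if (executable != writable)
        __builtin___clear_cache((char *)executable,
                                (char *)executable + len);
}
```

On targets with coherent instruction caches the builtin may expand to nothing,
so this sketch only shows the call pattern, not a guarantee about what any
particular core does.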
Richard Earnshaw - July 12, 2011, 9:12 a.m.
On 11/07/11 17:23, Andrew Haley wrote:
> On a multicore ARM, you really do have to clear both caches, not just the
> dcache.  This bug may exist in other ports too.
> 
> Andrew.
> 
> 
> 2011-07-11  Andrew Haley  <aph@redhat.com>
> 
>         * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
> 
> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
> index 885a9cb..b2e7667 100644
> --- a/src/arm/ffi.c
> +++ b/src/arm/ffi.c
> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);                   \
>     unsigned int  __fun = (unsigned int)(FUN);                          \
>     unsigned int  __ctx = (unsigned int)(CTX);                          \
> +   unsigned char *insns = (unsigned char *)(CTX);                       \
>     *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>     *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
>     *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \
>     *(unsigned int*) &__tramp[12] = __ctx;                              \
>     *(unsigned int*) &__tramp[16] = __fun;                              \
> -   __clear_cache((&__tramp[0]), (&__tramp[19]));                       \
> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int));             \
> +                                                 /* Clear instruction   \
> +                                                    mapping.  */        \
>   })
> 
>  #endif
> 
> 


Your patch looks sane, but I'll observe here that the poking of
instruction values is wrong on cores that run in BE-8 mode (where
instructions are always little-endian).

R.
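
To illustrate the point about BE-8: a store through an unsigned int pointer
writes the instruction in the data byte order, which on a big-endian build is
the opposite of what a BE-8 core fetches.  One possible workaround, sketched
below under the assumption that instructions must always be little-endian, is
to write each instruction byte by byte.  The helper store_insn_le is
hypothetical and does not appear in the patch.  It would be wrong on a BE-32
system, and the two data words (context and function pointer) would still be
stored in native order.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit ARM instruction with the least
   significant byte first, independent of the data endianness of the
   build. */
static void store_insn_le(unsigned char *p, uint32_t insn)
{
    p[0] = (unsigned char)(insn & 0xff);
    p[1] = (unsigned char)((insn >> 8) & 0xff);
    p[2] = (unsigned char)((insn >> 16) & 0xff);
    p[3] = (unsigned char)((insn >> 24) & 0xff);
}
```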
Andrew Haley - July 12, 2011, 9:15 a.m.
On 12/07/11 10:12, Richard Earnshaw wrote:
> On 11/07/11 17:23, Andrew Haley wrote:
>> On a multicore ARM, you really do have to clear both caches, not just the
>> dcache.  This bug may exist in other ports too.
>>
>> Andrew.
>>
>>
>> 2011-07-11  Andrew Haley  <aph@redhat.com>
>>
>>         * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
>>
>> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
>> index 885a9cb..b2e7667 100644
>> --- a/src/arm/ffi.c
>> +++ b/src/arm/ffi.c
>> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);                   \
>>     unsigned int  __fun = (unsigned int)(FUN);                          \
>>     unsigned int  __ctx = (unsigned int)(CTX);                          \
>> +   unsigned char *insns = (unsigned char *)(CTX);                       \
>>     *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>>     *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
>>     *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \
>>     *(unsigned int*) &__tramp[12] = __ctx;                              \
>>     *(unsigned int*) &__tramp[16] = __fun;                              \
>> -   __clear_cache((&__tramp[0]), (&__tramp[19]));                       \
>> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
>> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int));             \
>> +                                                 /* Clear instruction   \
>> +                                                    mapping.  */        \
>>   })
>>
>>  #endif
>>
>>
> 
> 
> Your patch looks sane, but I'll observe here that the poking of
> instruction values is wrong on cores that run in BE-8 mode (where
> instructions are always little-endian).

Oh dear.  How would one test for BE-8 mode on a Linux system?

Thanks,
Andrew.
Richard Earnshaw - July 12, 2011, 9:59 a.m.
On 12/07/11 10:15, Andrew Haley wrote:
> On 12/07/11 10:12, Richard Earnshaw wrote:
>> On 11/07/11 17:23, Andrew Haley wrote:
>>> On a multicore ARM, you really do have to clear both caches, not just the
>>> dcache.  This bug may exist in other ports too.
>>>
>>> Andrew.
>>>
>>>
>>> 2011-07-11  Andrew Haley  <aph@redhat.com>
>>>
>>>         * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
>>>
>>> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
>>> index 885a9cb..b2e7667 100644
>>> --- a/src/arm/ffi.c
>>> +++ b/src/arm/ffi.c
>>> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>>>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);                   \
>>>     unsigned int  __fun = (unsigned int)(FUN);                          \
>>>     unsigned int  __ctx = (unsigned int)(CTX);                          \
>>> +   unsigned char *insns = (unsigned char *)(CTX);                       \
>>>     *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>>>     *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
>>>     *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \
>>>     *(unsigned int*) &__tramp[12] = __ctx;                              \
>>>     *(unsigned int*) &__tramp[16] = __fun;                              \
>>> -   __clear_cache((&__tramp[0]), (&__tramp[19]));                       \
>>> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
>>> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int));             \
>>> +                                                 /* Clear instruction   \
>>> +                                                    mapping.  */        \
>>>   })
>>>
>>>  #endif
>>>
>>>
>>
>>
>> Your patch looks sane, but I'll observe here that the poking of
>> instruction values is wrong on cores that run in BE-8 mode (where
>> instructions are always little-endian).
> 
> Oh dear.  How would one test for BE-8 mode on a Linux system?
> 
> Thanks,
> Andrew.
> 
> 

Essentially v6 or later and big-endian.  It is possible to run some v6
(but no v7) cores in BE-32 mode, but you can't then have unaligned
access support.

To know the configuration for sure, you need to read the SCTLR register
(in CP15 space), but that's not available in user-mode.

R.
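
Since SCTLR cannot be read from user mode, the "v6 or later and big-endian"
rule can only be approximated at compile time.  The sketch below assumes a
compiler that predefines __ARMEB__ for big-endian ARM targets and the ACLE
macro __ARM_ARCH; the names ASSUME_BE8 and assume_be8 are hypothetical.  It is
a heuristic, not a test: a v6 core configured for BE-32 would be misclassified.

```c
/* Heuristic only: a big-endian build for ARMv6 or later is assumed to
   run in BE-8 mode.  Everything else is treated as not BE-8. */
#if defined(__ARMEB__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 6)
#define ASSUME_BE8 1
#else
#define ASSUME_BE8 0
#endif

/* Hypothetical accessor so callers need not use the macro directly. */
static int assume_be8(void)
{
    return ASSUME_BE8;
}
```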
Joseph S. Myers - July 21, 2011, 3:33 p.m.
On Tue, 12 Jul 2011, Andrew Haley wrote:

> >>     *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
> >>     *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
> >>     *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \

> > Your patch looks sane, but I'll observe here that the poking of
> > instruction values is wrong on cores that run in BE-8 mode (where
> > instructions are always little-endian).
> 
> Oh dear.  How would one test for BE-8 mode on a Linux system?

My suggestion would be putting the instruction sequence in a .s file, 
rather than hardcoding the instruction encodings here, and writing the 
code to read from the sequence as assembled by the assembler.  That way it 
will have the appropriate mapping symbols to mark it as ARM-mode code and 
the linker will deal with adjusting endianness, so you don't need to test 
for BE-8 at all.
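
The C side of that suggestion might look roughly like the sketch below.  The
byte array is only a stand-in so that the example compiles on its own; in a
real build the bytes would come from the assembled .s file, bounded by symbols
it exports (hypothetical names such as tramp_template and
tramp_template_end), and the linker would have already put them in the right
byte order.  The function name init_trampoline and the 32-bit ctx and fun
parameters are likewise assumptions, and cache clearing is omitted.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the assembled instruction sequence.  In a real build
   this would be provided by a .s file, not written out as a C array. */
static const unsigned char tramp_template[] = {
    0x0f, 0x00, 0x2d, 0xe9,   /* stmfd sp!, {r0-r3} */
    0x00, 0x00, 0x9f, 0xe5,   /* ldr r0, [pc] */
    0x00, 0xf0, 0x9f, 0xe5    /* ldr pc, [pc] */
};

/* Copy the instruction template, then append the two data words in
   native byte order: the context first, the target function second. */
static void init_trampoline(unsigned char *tramp, uint32_t ctx, uint32_t fun)
{
    memcpy(tramp, tramp_template, sizeof tramp_template);
    memcpy(tramp + sizeof tramp_template, &ctx, sizeof ctx);
    memcpy(tramp + sizeof tramp_template + sizeof ctx, &fun, sizeof fun);
}
```

Because the instructions are copied as opaque bytes, the C code never needs to
know which byte order the core uses for instruction fetch.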
Andrew Haley - July 25, 2011, 9:33 a.m.
On 21/07/11 16:33, Joseph S. Myers wrote:
> On Tue, 12 Jul 2011, Andrew Haley wrote:
> 
>>>>     *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>>>>     *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
>>>>     *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \
> 
>>> Your patch looks sane, but I'll observe here that the poking of
>>> instruction values is wrong on cores that run in BE-8 mode (where
>>> instructions are always little-endian).
>>
>> Oh dear.  How would one test for BE-8 mode on a Linux system?
> 
> My suggestion would be putting the instruction sequence in a .s file, 
> rather than hardcoding the instruction encodings here, and writing the 
> code to read from the sequence as assembled by the assembler.  That way it 
> will have the appropriate mapping symbols to mark it as ARM-mode code and 
> the linker will deal with adjusting endianness, so you don't need to test 
> for BE-8 at all.

OK, I'll have a look at doing that.

Andrew.

Patch

diff --git a/src/arm/ffi.c b/src/arm/ffi.c
index 885a9cb..b2e7667 100644
--- a/src/arm/ffi.c
+++ b/src/arm/ffi.c
@@ -558,12 +558,16 @@  ffi_closure_free (void *ptr)
 ({ unsigned char *__tramp = (unsigned char*)(TRAMP);                   \
    unsigned int  __fun = (unsigned int)(FUN);                          \
    unsigned int  __ctx = (unsigned int)(CTX);                          \
+   unsigned char *insns = (unsigned char *)(CTX);                       \
    *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
    *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */       \
    *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */       \
    *(unsigned int*) &__tramp[12] = __ctx;                              \
    *(unsigned int*) &__tramp[16] = __fun;                              \
-   __clear_cache((&__tramp[0]), (&__tramp[19]));                       \
+   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
+   __clear_cache(insns, insns + 3 * sizeof (unsigned int));             \
+                                                 /* Clear instruction   \
+                                                    mapping.  */        \
  })

 #endif