Patchwork [2/5] arm: LLVMLinux: use current_stack_pointer for percpu

Submitter Behan Webster
Date Sept. 6, 2013, 9:28 p.m.
Message ID <1378502899-1241-3-git-send-email-behanw@converseincode.com>
Permalink /patch/273335/
State New

Comments

Behan Webster - Sept. 6, 2013, 9:28 p.m.
From: Behan Webster <behanw@converseincode.com>

The existing code uses named registers to get the value of the stack pointer.
The new current_stack_pointer macro is more readable and allows for a central
portable implementation of how to get the stack pointer with ASM.  This change
supports being able to compile the kernel with both gcc and Clang.

Signed-off-by: Mark Charlebois <charlebm@gmail.com>
Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
---
 arch/arm/include/asm/percpu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Russell King - ARM Linux - Sept. 6, 2013, 10:22 p.m.
On Fri, Sep 06, 2013 at 05:28:08PM -0400, behanw@converseincode.com wrote:
> From: Behan Webster <behanw@converseincode.com>
> 
> The existing code uses named registers to get the value of the stack pointer.
> The new current_stack_pointer macro is more readable and allows for a central
> portable implementation of how to get the stack pointer with ASM.  This change
> supports being able to compile the kernel with both gcc and Clang.
> 
> Signed-off-by: Mark Charlebois <charlebm@gmail.com>
> Signed-off-by: Behan Webster <behanw@converseincode.com>
> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
> ---
>  arch/arm/include/asm/percpu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
> index 209e650..629a975 100644
> --- a/arch/arm/include/asm/percpu.h
> +++ b/arch/arm/include/asm/percpu.h
> @@ -30,14 +30,14 @@ static inline void set_my_cpu_offset(unsigned long off)
>  static inline unsigned long __my_cpu_offset(void)
>  {
>  	unsigned long off;
> -	register unsigned long *sp asm ("sp");
> +	unsigned long sp = current_stack_pointer;
>  
>  	/*
>  	 * Read TPIDRPRW.
>  	 * We want to allow caching the value, so avoid using volatile and
>  	 * instead use a fake stack read to hazard against barrier().
>  	 */
> -	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
> +	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));

This looks like it's breaking what's going on here.  With the original
code, we're passing the contents of the word at the stack pointer into
the assembly via a "Q" constraint.  After this change, we're passing
the _value_ of the stack pointer.

Also, if you read the comment, it's certainly wrong.
Måns Rullgård - Sept. 6, 2013, 10:31 p.m.
behanw@converseincode.com writes:

> From: Behan Webster <behanw@converseincode.com>
>
> The existing code uses named registers to get the value of the stack pointer.
> The new current_stack_pointer macro is more readable and allows for a central
> portable implementation of how to get the stack pointer with ASM.  This change
> supports being able to compile the kernel with both gcc and Clang.
>
> Signed-off-by: Mark Charlebois <charlebm@gmail.com>
> Signed-off-by: Behan Webster <behanw@converseincode.com>
> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
> ---
>  arch/arm/include/asm/percpu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
> index 209e650..629a975 100644
> --- a/arch/arm/include/asm/percpu.h
> +++ b/arch/arm/include/asm/percpu.h
> @@ -30,14 +30,14 @@ static inline void set_my_cpu_offset(unsigned long off)
>  static inline unsigned long __my_cpu_offset(void)
>  {
>  	unsigned long off;
> -	register unsigned long *sp asm ("sp");
> +	unsigned long sp = current_stack_pointer;
>
>  	/*
>  	 * Read TPIDRPRW.
>  	 * We want to allow caching the value, so avoid using volatile and
>  	 * instead use a fake stack read to hazard against barrier().
>  	 */
> -	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
> +	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));

This doesn't do quite the same thing.  The existing code pretends to
read something from the stack in order to create a barrier of some
sort.  Your new code stores the value of the stack pointer to a location
on the stack for consumption by the "Q" memory constraint.  This store
is not necessary and should preferably be avoided.
Behan Webster - Sept. 6, 2013, 10:56 p.m.
On 09/06/13 18:22, Russell King - ARM Linux wrote:
> On Fri, Sep 06, 2013 at 05:28:08PM -0400, behanw@converseincode.com wrote:
>> From: Behan Webster <behanw@converseincode.com>
>>
>> The existing code uses named registers to get the value of the stack pointer.
>> The new current_stack_pointer macro is more readable and allows for a central
>> portable implementation of how to get the stack pointer with ASM.  This change
>> supports being able to compile the kernel with both gcc and Clang.
>>
>> Signed-off-by: Mark Charlebois <charlebm@gmail.com>
>> Signed-off-by: Behan Webster <behanw@converseincode.com>
>> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
>> ---
>>   arch/arm/include/asm/percpu.h | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
>> index 209e650..629a975 100644
>> --- a/arch/arm/include/asm/percpu.h
>> +++ b/arch/arm/include/asm/percpu.h
>> @@ -30,14 +30,14 @@ static inline void set_my_cpu_offset(unsigned long off)
>>   static inline unsigned long __my_cpu_offset(void)
>>   {
>>   	unsigned long off;
>> -	register unsigned long *sp asm ("sp");
>> +	unsigned long sp = current_stack_pointer;
>>   
>>   	/*
>>   	 * Read TPIDRPRW.
>>   	 * We want to allow caching the value, so avoid using volatile and
>>   	 * instead use a fake stack read to hazard against barrier().
>>   	 */
>> -	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
>> +	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));
> This looks like it's breaking what's going on here.  With the original
> code, we're passing the contents of the word at the stack pointer into
> the assembly via a "Q" constraint.  After this change, we're passing
> the _value_ of the stack pointer.
>
> Also, if you read the comment, it's certainly wrong.
This code was rewritten a few times while trying to remove the extra copy,
and I think this bug crept in along the way.

Of course you're right. I will fix it.

Thanks,

Behan
Behan Webster - Sept. 6, 2013, 10:59 p.m.
On 09/06/13 18:31, Måns Rullgård wrote:
> behanw@converseincode.com writes:
>
>> From: Behan Webster <behanw@converseincode.com>
>>
>> The existing code uses named registers to get the value of the stack pointer.
>> The new current_stack_pointer macro is more readable and allows for a central
>> portable implementation of how to get the stack pointer with ASM.  This change
>> supports being able to compile the kernel with both gcc and Clang.
>>
>> Signed-off-by: Mark Charlebois <charlebm@gmail.com>
>> Signed-off-by: Behan Webster <behanw@converseincode.com>
>> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
>> ---
>>   arch/arm/include/asm/percpu.h | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
>> index 209e650..629a975 100644
>> --- a/arch/arm/include/asm/percpu.h
>> +++ b/arch/arm/include/asm/percpu.h
>> @@ -30,14 +30,14 @@ static inline void set_my_cpu_offset(unsigned long off)
>>   static inline unsigned long __my_cpu_offset(void)
>>   {
>>   	unsigned long off;
>> -	register unsigned long *sp asm ("sp");
>> +	unsigned long sp = current_stack_pointer;
>>
>>   	/*
>>   	 * Read TPIDRPRW.
>>   	 * We want to allow caching the value, so avoid using volatile and
>>   	 * instead use a fake stack read to hazard against barrier().
>>   	 */
>> -	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
>> +	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));
> This doesn't do quite the same thing.  The existing code pretends to
> read something from the stack in order to create a barrier of some
> sort.  Your new code stores the value of the stack pointer to a location
> on the stack for consumption by the "Q" memory constraint.
Agreed. My bug. Will fix.

>    This store is not necessary and should preferably be avoided.
I agree that the extra store should be avoided. I wasn't able to
remove it. Can you suggest how?

Thanks,

Behan
Nicolas Pitre - Sept. 7, 2013, 5:12 a.m.
On Fri, 6 Sep 2013, behanw@converseincode.com wrote:

> From: Behan Webster <behanw@converseincode.com>
> 
> The existing code uses named registers to get the value of the stack pointer.
> The new current_stack_pointer macro is more readable and allows for a central
> portable implementation of how to get the stack pointer with ASM.  This change
> supports being able to compile the kernel with both gcc and Clang.
> 
> Signed-off-by: Mark Charlebois <charlebm@gmail.com>
> Signed-off-by: Behan Webster <behanw@converseincode.com>
> Reviewed-by: Jan-Simon Möller <dl9pf@gmx.de>
> ---
>  arch/arm/include/asm/percpu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
> index 209e650..629a975 100644
> --- a/arch/arm/include/asm/percpu.h
> +++ b/arch/arm/include/asm/percpu.h
> @@ -30,14 +30,14 @@ static inline void set_my_cpu_offset(unsigned long off)
>  static inline unsigned long __my_cpu_offset(void)
>  {
>  	unsigned long off;
> -	register unsigned long *sp asm ("sp");
> +	unsigned long sp = current_stack_pointer;
>  
>  	/*
>  	 * Read TPIDRPRW.
>  	 * We want to allow caching the value, so avoid using volatile and
>  	 * instead use a fake stack read to hazard against barrier().
>  	 */
> -	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
> +	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));

This change doesn't look to be equivalent.  Previously the *sp implied a 
memory location which doesn't appear to be the case anymore.

This sp trickery was introduced in commit 509eb76ebf97 to solve bad code 
generation (the commit log has the details).  It would be good if Will 
Deacon could confirm that his test case still works fine with your 
change.


Nicolas

Patch

diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
index 209e650..629a975 100644
--- a/arch/arm/include/asm/percpu.h
+++ b/arch/arm/include/asm/percpu.h
@@ -30,14 +30,14 @@  static inline void set_my_cpu_offset(unsigned long off)
 static inline unsigned long __my_cpu_offset(void)
 {
 	unsigned long off;
-	register unsigned long *sp asm ("sp");
+	unsigned long sp = current_stack_pointer;
 
 	/*
 	 * Read TPIDRPRW.
 	 * We want to allow caching the value, so avoid using volatile and
 	 * instead use a fake stack read to hazard against barrier().
 	 */
-	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
+	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (sp));
 
 	return off;
 }
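
The difference the reviewers point out can be sketched in portable C. This is an illustration only, not code from the thread: it swaps the ARM-specific "Q" constraint for the generic "m" constraint, uses an empty asm template in place of the MRC instruction, and introduces hypothetical names (fake_offset, stack_word, offset_with_memory_hazard, offset_with_spilled_value) so that it builds with gcc or Clang in user space.

```c
/* Hypothetical stand-in for the value the MRC would read from TPIDRPRW. */
static unsigned long fake_offset = 64;

/* Hypothetical stand-in for the word located at the stack pointer. */
static unsigned long stack_word;

/*
 * Shape of the original code: the input operand is the memory that sp
 * points to. The asm is not volatile, so the result may be cached, but
 * the memory input makes it depend on memory, which is what lets a
 * compiler barrier (a "memory" clobber) force a re-read.
 */
static inline unsigned long offset_with_memory_hazard(void)
{
	unsigned long off = fake_offset;
	const unsigned long *sp = &stack_word;

	__asm__("" : "+r" (off) : "m" (*sp));
	return off;
}

/*
 * Shape of the patched code: the input operand is the variable sp
 * itself. A memory constraint on a local variable requires that the
 * variable live in memory, so its value has to be stored first.
 */
static inline unsigned long offset_with_spilled_value(void)
{
	unsigned long off = fake_offset;
	unsigned long sp = (unsigned long)&stack_word;

	__asm__("" : "+r" (off) : "m" (sp));
	return off;
}
```

Both functions return the same value; what differs is the operand the compiler must provide. In the first, the operand names memory that already exists. In the second, the compiler has to materialize the local sp in memory to satisfy the constraint, which is the extra store mentioned in the review.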