diff mbox

[v2,1/5,ppc] Process dynamic relocations for kernel

Message ID 20111109120303.51ac3b1b@suzukikp.in.ibm.com (mailing list archive)
State Superseded
Headers show

Commit Message

Suzuki Poulose Nov. 9, 2011, 6:33 a.m. UTC
On Tue, 08 Nov 2011 10:19:05 -0600
Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:

> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > What I was suggesting is, instead of flushing the cache in
> > relocate(), lets do it like:
> > 
> > for e.g, on 440x, (in head_44x.S :)
> > 
> > #ifdef CONFIG_RELOCATABLE
> > 	...
> > 	bl relocate
> > 
> > 	#Flush the d-cache and invalidate the i-cache here
> > #endif
> > 
> > 
> > This would let the different platforms do the the cache
> > invalidation in their own way.
> > 
> > Btw, I didn't find an instruction to flush the entire d-cache in
> > PPC440 manual. We have instructions to flush only a block
> > corresponding to an address.
> > 
> > However, we have 'iccci' which would invalidate the entire i-cache
> > which, which I think is better than 80,000 i-cache invalidates.
> 
> In misc_32.S there are already some platform-independent cache
> management functions.  If we use those, then relocate() could simply
> call them.  Then the different platforms calling relocate() wouldn't
> have to worry about flushing/invalidating caches.
> 
> For example, there's a clean_dcache_range() function.  Given any range
> twice the size of the d-cache, it should flush the entire d-cache.
> But the only drawback is that it would require the caller to know the
> size of the d-cache.
> 
> Instead, I think it would be preferable to create a new clean_dcache()
> (or clean_dcache_all()?) function in misc_32.S, which could call
> clean_dcache_range() with the appropriate args for flushing the entire
> d-cache.  relocate() could then call the platform-independent
> clean_dcache().
> 


How about using clean_dcache_range() to flush the range runtime
address range [ _stext, _end ] ? That would flush the entire
instructions.


> For i-cache invalidation there's already the (incorrectly named?)
> flush_instruction_cache().  It uses the appropriate platform-specific
> methods (e.g. iccci for 44x) to invalidate the entire i-cache.

Agreed. The only thing that worries me is the use of KERNELBASE in the 
flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
implementations ignore the arguments passed to iccci ? 
> 
> Suzuki, if you agree with this direction, I could work up a new patch
> if needed.
> 

I have the following (untested) patch which uses clean_dcache_range()
and flush_instruction_cache(): I will send the next version soon
with those changes and the DYNAMIC_MEMSTART config for oldstyle
relocatoin, if every one agrees to this.


 	add	r11, r11, r12	/* runtime address of .dynamic
section */ @@ -87,18 +86,19 @@ eodyn:				/*
End of Dyn Table scan */
 	 * Work out the current offset from the link time address
of .rela
 	 * section.
 	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
-	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
-	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
+	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
+	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
 	 */
 	subf	r7, r7, r9	/* cur_offset */
-	subf	r10, r7, r10
-	subf	r3, r10, r3	/* final_offset */
+	subf	r12, r7, r10
+	subf	r3, r12, r3	/* final_offset */
 
 	subf	r8, r6, r8	/* relaz -= relaent */
 	/*
 	 * Scan through the .rela table and process each entry
 	 * r9	- points to the current .rela table entry
 	 * r13	- points to the symbol table
+	 * r10  - holds the runtime address of _stext
 	 */
 
 	/*
@@ -180,12 +180,23 @@ store_half:
 
 nxtrela:
 	cmpwi	r8, 0		/* relasz = 0 ? */
-	ble	done
+	ble	flush
 	add	r9, r9, r6	/* move to next entry in
the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
 	b	applyrela
 
-done:	blr
+	/* Flush the d-cache'd instructions */
+flush:
+	mr	r3, r10
+	lis	r4, (_end - _stext)@h
+	ori	r4, r4, (_end - _stext)@l
+	add	r4, r3, r4
+	bl	clean_dcache_range
+	/* Invalidate the i-cache */
+	bl	flush_instruction_cache
+done:
+	mtlr	r14
+	blr
 
 
 p_dyn:		.long	__dynamic_start - 0b


Thanks

Suzuki

Comments

Suzuki Poulose Nov. 9, 2011, 8:42 a.m. UTC | #1
On 11/09/11 12:03, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>  wrote:
>
>> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
>>> What I was suggesting is, instead of flushing the cache in
>>> relocate(), lets do it like:
>>>
>>> for e.g, on 440x, (in head_44x.S :)
>>>
>>> #ifdef CONFIG_RELOCATABLE
>>> 	...
>>> 	bl relocate
>>>
>>> 	#Flush the d-cache and invalidate the i-cache here
>>> #endif
>>>
>>>
>>> This would let the different platforms do the the cache
>>> invalidation in their own way.
>>>
>>> Btw, I didn't find an instruction to flush the entire d-cache in
>>> PPC440 manual. We have instructions to flush only a block
>>> corresponding to an address.
>>>
>>> However, we have 'iccci' which would invalidate the entire i-cache
>>> which, which I think is better than 80,000 i-cache invalidates.
>>
>> In misc_32.S there are already some platform-independent cache
>> management functions.  If we use those, then relocate() could simply
>> call them.  Then the different platforms calling relocate() wouldn't
>> have to worry about flushing/invalidating caches.
>>
>> For example, there's a clean_dcache_range() function.  Given any range
>> twice the size of the d-cache, it should flush the entire d-cache.
>> But the only drawback is that it would require the caller to know the
>> size of the d-cache.
>>
>> Instead, I think it would be preferable to create a new clean_dcache()
>> (or clean_dcache_all()?) function in misc_32.S, which could call
>> clean_dcache_range() with the appropriate args for flushing the entire
>> d-cache.  relocate() could then call the platform-independent
>> clean_dcache().
>>
>
>
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.
>
>
>> For i-cache invalidation there's already the (incorrectly named?)
>> flush_instruction_cache().  It uses the appropriate platform-specific
>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
>
> Agreed. The only thing that worries me is the use of KERNELBASE in the
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ?
>>
>> Suzuki, if you agree with this direction, I could work up a new patch
>> if needed.
>>
>
> I have the following (untested) patch which uses clean_dcache_range()
> and flush_instruction_cache(): I will send the next version soon
> with those changes and the DYNAMIC_MEMSTART config for oldstyle
> relocatoin, if every one agrees to this.
>
>
> diff --git a/arch/powerpc/kernel/reloc_32.S
> b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
> --- a/arch/powerpc/kernel/reloc_32.S
> +++ b/arch/powerpc/kernel/reloc_32.S
> @@ -33,10 +33,9 @@ R_PPC_RELATIVE = 22
>
>   _GLOBAL(relocate)
>
> -	mflr	r0
> +	mflr	r14		/* Save our LR */
>   	bl	0f		/* Find our current runtime
> address */ 0:	mflr	r12		/* Make it
> accessible */
> -	mtlr	r0
>
>   	lwz	r11, (p_dyn - 0b)(r12)
>   	add	r11, r11, r12	/* runtime address of .dynamic
> section */ @@ -87,18 +86,19 @@ eodyn:				/*
> End of Dyn Table scan */
>   	 * Work out the current offset from the link time address
> of .rela
>   	 * section.
>   	 *  cur_offset[r7] = rela.run[r9] - rela.link [r7]
> -	 *  _stext.link[r10] = _stext.run[r10] - cur_offset[r7]
> -	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r10]
> +	 *  _stext.link[r12] = _stext.run[r10] - cur_offset[r7]
> +	 *  final_offset[r3] = _stext.final[r3] - _stext.link[r12]
>   	 */
>   	subf	r7, r7, r9	/* cur_offset */
> -	subf	r10, r7, r10
> -	subf	r3, r10, r3	/* final_offset */
> +	subf	r12, r7, r10
> +	subf	r3, r12, r3	/* final_offset */
>
>   	subf	r8, r6, r8	/* relaz -= relaent */
>   	/*
>   	 * Scan through the .rela table and process each entry
>   	 * r9	- points to the current .rela table entry
>   	 * r13	- points to the symbol table
> +	 * r10  - holds the runtime address of _stext
>   	 */
>
>   	/*
> @@ -180,12 +180,23 @@ store_half:
>
>   nxtrela:
>   	cmpwi	r8, 0		/* relasz = 0 ? */
> -	ble	done
> +	ble	flush
>   	add	r9, r9, r6	/* move to next entry in
> the .rela table */ subf	r8, r6, r8	/* relasz -= relaent */
>   	b	applyrela
>
> -done:	blr
> +	/* Flush the d-cache'd instructions */
> +flush:
> +	mr	r3, r10
> +	lis	r4, (_end - _stext)@h
> +	ori	r4, r4, (_end - _stext)@l

Err ! This doesn't compile :

arch/powerpc/kernel/reloc_32.S: Assembler messages:
arch/powerpc/kernel/reloc_32.S:191: Error: can't resolve `_end' {*UND* section} - `_stext' {*UND* section}


I will fix it, but the idea remains the same.

Thanks

Suzuki
Josh Poimboeuf Nov. 9, 2011, 2:53 p.m. UTC | #2
On Wed, 2011-11-09 at 12:03 +0530, Suzuki Poulose wrote:
> On Tue, 08 Nov 2011 10:19:05 -0600
> Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> wrote:
> 
> > On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
> > > What I was suggesting is, instead of flushing the cache in
> > > relocate(), lets do it like:
> > > 
> > > for e.g, on 440x, (in head_44x.S :)
> > > 
> > > #ifdef CONFIG_RELOCATABLE
> > > 	...
> > > 	bl relocate
> > > 
> > > 	#Flush the d-cache and invalidate the i-cache here
> > > #endif
> > > 
> > > 
> > > This would let the different platforms do the the cache
> > > invalidation in their own way.
> > > 
> > > Btw, I didn't find an instruction to flush the entire d-cache in
> > > PPC440 manual. We have instructions to flush only a block
> > > corresponding to an address.
> > > 
> > > However, we have 'iccci' which would invalidate the entire i-cache
> > > which, which I think is better than 80,000 i-cache invalidates.
> > 
> > In misc_32.S there are already some platform-independent cache
> > management functions.  If we use those, then relocate() could simply
> > call them.  Then the different platforms calling relocate() wouldn't
> > have to worry about flushing/invalidating caches.
> > 
> > For example, there's a clean_dcache_range() function.  Given any range
> > twice the size of the d-cache, it should flush the entire d-cache.
> > But the only drawback is that it would require the caller to know the
> > size of the d-cache.
> > 
> > Instead, I think it would be preferable to create a new clean_dcache()
> > (or clean_dcache_all()?) function in misc_32.S, which could call
> > clean_dcache_range() with the appropriate args for flushing the entire
> > d-cache.  relocate() could then call the platform-independent
> > clean_dcache().
> > 
> 
> 
> How about using clean_dcache_range() to flush the range runtime
> address range [ _stext, _end ] ? That would flush the entire
> instructions.

Wouldn't that result in more cache flushing than the original solution? 

For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
(3.5MB / 32).


> 
> 
> > For i-cache invalidation there's already the (incorrectly named?)
> > flush_instruction_cache().  It uses the appropriate platform-specific
> > methods (e.g. iccci for 44x) to invalidate the entire i-cache.
> 
> Agreed. The only thing that worries me is the use of KERNELBASE in the 
> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> implementations ignore the arguments passed to iccci ? 

Good question.  I don't know the answer. :-)

That also may suggest a bigger can of worms.  A grep of the powerpc code
shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
be relying on KERNELBASE except for the early relocation code.  Are we
sure that all the other usages of KERNELBASE are "safe"?

For example -- the DEBUG_CRIT_EXCEPTION macro, which head_44x.S uses,
relies on KERNELBASE.  That doesn't seem right to me.


Josh
Suzuki Poulose Nov. 10, 2011, 2:31 a.m. UTC | #3
On 11/09/11 20:23, Josh Poimboeuf wrote:
> On Wed, 2011-11-09 at 12:03 +0530, Suzuki Poulose wrote:
>> On Tue, 08 Nov 2011 10:19:05 -0600
>> Josh Poimboeuf<jpoimboe@linux.vnet.ibm.com>  wrote:
>>
>>> On Tue, 2011-11-08 at 12:41 +0530, Suzuki Poulose wrote:
>>>> What I was suggesting is, instead of flushing the cache in
>>>> relocate(), lets do it like:
>>>>
>>>> for e.g, on 440x, (in head_44x.S :)
>>>>
>>>> #ifdef CONFIG_RELOCATABLE
>>>> 	...
>>>> 	bl relocate
>>>>
>>>> 	#Flush the d-cache and invalidate the i-cache here
>>>> #endif
>>>>
>>>>
>>>> This would let the different platforms do the the cache
>>>> invalidation in their own way.
>>>>
>>>> Btw, I didn't find an instruction to flush the entire d-cache in
>>>> PPC440 manual. We have instructions to flush only a block
>>>> corresponding to an address.
>>>>
>>>> However, we have 'iccci' which would invalidate the entire i-cache
>>>> which, which I think is better than 80,000 i-cache invalidates.
>>>
>>> In misc_32.S there are already some platform-independent cache
>>> management functions.  If we use those, then relocate() could simply
>>> call them.  Then the different platforms calling relocate() wouldn't
>>> have to worry about flushing/invalidating caches.
>>>
>>> For example, there's a clean_dcache_range() function.  Given any range
>>> twice the size of the d-cache, it should flush the entire d-cache.
>>> But the only drawback is that it would require the caller to know the
>>> size of the d-cache.
>>>
>>> Instead, I think it would be preferable to create a new clean_dcache()
>>> (or clean_dcache_all()?) function in misc_32.S, which could call
>>> clean_dcache_range() with the appropriate args for flushing the entire
>>> d-cache.  relocate() could then call the platform-independent
>>> clean_dcache().
>>>
>>
>>
>> How about using clean_dcache_range() to flush the range runtime
>> address range [ _stext, _end ] ? That would flush the entire
>> instructions.
>
> Wouldn't that result in more cache flushing than the original solution?
>
> For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
> clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
> (3.5MB / 32).

Oops ! You are right. We could go back to the clean_dcache_all() or the
initial approach that you suggested. (dcbst).

I am not sure how do we flush the entire dcache(only). Could you post a
patch which does the same ?

Another option is to, change the current mapping to 'Write Through' before
calling the relocate() and revert back to the original setting after relocate().

>
>
>>
>>
>>> For i-cache invalidation there's already the (incorrectly named?)
>>> flush_instruction_cache().  It uses the appropriate platform-specific
>>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
>>
>> Agreed. The only thing that worries me is the use of KERNELBASE in the
>> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
>> implementations ignore the arguments passed to iccci ?
>
> Good question.  I don't know the answer. :-)
>
> That also may suggest a bigger can of worms.  A grep of the powerpc code
> shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> be relying on KERNELBASE except for the early relocation code.  Are we
> sure that all the other usages of KERNELBASE are "safe"?
>
I think we could simply replace the occurrences of KERNELBASE (after the relocate())
  with '_stext' which would give the virtual start address of the kernel.

Thanks
Suzuki
David Laight Nov. 10, 2011, 9:15 a.m. UTC | #4
> >> How about using clean_dcache_range() to flush the range runtime
> >> address range [ _stext, _end ] ? That would flush the entire
> >> instructions.
> >
> > Wouldn't that result in more cache flushing than the original
solution?
> >
> > For example, my kernel is 3.5MB.  Assuming a 32 byte cache line
size,
> > clean_dcache_range(_stext, _end) would result in about 115,000
dcbst's
> > (3.5MB / 32).
> 
> Oops ! You are right. We could go back to the 
> clean_dcache_all() or the
> initial approach that you suggested. (dcbst).

Maybe clean_dcache_range() itself should limit the range
to the size of the cache?
I suspect that code can easily know the cache size.

	David
Josh Poimboeuf Nov. 10, 2011, 9:44 p.m. UTC | #5
On Thu, 2011-11-10 at 08:01 +0530, Suzuki Poulose wrote:
> >> How about using clean_dcache_range() to flush the range runtime
> >> address range [ _stext, _end ] ? That would flush the entire
> >> instructions.
> >
> > Wouldn't that result in more cache flushing than the original solution?
> >
> > For example, my kernel is 3.5MB.  Assuming a 32 byte cache line size,
> > clean_dcache_range(_stext, _end) would result in about 115,000 dcbst's
> > (3.5MB / 32).
> 
> Oops ! You are right. We could go back to the clean_dcache_all() or the
> initial approach that you suggested. (dcbst).
> 
> I am not sure how do we flush the entire dcache(only). Could you post a
> patch which does the same ?

It turns out that my original idea of giving a 64k address range to
clean_dcache_range() wouldn't work, because dcbst only flushes the line
if the given address is in the cache already.


Also, I experimented with a simple clean_dcache_all() function in
misc_32.S:

#define L1_CACHE_SIZE   (32 * 1024)
#define L1_CACHE_LINES  (L1_CACHE_SIZE / L1_CACHE_BYTES) 
_GLOBAL(clean_dcache_all)
        lis     r3, _start@h
        ori     r3, r3, _start@l
        li      r4, (L1_CACHE_LINES * 2)
        mtctr   r4
1:      lwz     r5, 0(r3)
        addi    r3, r3, L1_CACHE_BYTES
        bdnz    1b
        sync
        blr

But this approach has some issues:

1. It should probably be made more platform-independent with respect to
d-cache size.  I'm not sure the best way to achieve that.

2. The _start address is the kernel virtual address, not the physical
address, but relocate() is running without TLB address translation
enabled.  Although we could easily circumvent this problem by clearing
the d-cache directly in relocate() (or in head_44x.S) using physical
addresses.

3. Chicken/egg issue:  the _start address might be stale because we
haven't yet flushed the d-cache and invalidated the i-cache.

I also discovered that calling flush_instruction_cache() from relocate()
wouldn't work for all platforms, for similar reasons.

We could overcome these issues with more code, but the added complexity
might not be worth it (premature optimization and all).  My original
patch at least has the benefit of being simple.


> 
> Another option is to, change the current mapping to 'Write Through' before
> calling the relocate() and revert back to the original setting after relocate().

True, that's another option.  Although since TLB handling is
platform-specific, I think it would have to be handled by the caller in
head_44x.S, rather than within relocate().


> > That also may suggest a bigger can of worms.  A grep of the powerpc code
> > shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> > be relying on KERNELBASE except for the early relocation code.  Are we
> > sure that all the other usages of KERNELBASE are "safe"?
> >
> I think we could simply replace the occurrences of KERNELBASE (after the relocate())
>   with '_stext' which would give the virtual start address of the kernel.

Yeah, that would work.


Josh
Benjamin Herrenschmidt Nov. 11, 2011, 4:11 a.m. UTC | #6
On Thu, 2011-11-10 at 08:01 +0530, Suzuki Poulose wrote:

> Oops ! You are right. We could go back to the clean_dcache_all() or the
> initial approach that you suggested. (dcbst).
> 
> I am not sure how do we flush the entire dcache(only). Could you post a
> patch which does the same ?
> 
> Another option is to, change the current mapping to 'Write Through' before
> calling the relocate() and revert back to the original setting after relocate().

Why not just keep the dcbst's & icbi's in relocate for now ? (original
patch) Is it noticably slower to boot ? If not I'd say keep it that way,
it will work on all implementations.

Cheers,
Ben.
> >
> >>
> >>
> >>> For i-cache invalidation there's already the (incorrectly named?)
> >>> flush_instruction_cache().  It uses the appropriate platform-specific
> >>> methods (e.g. iccci for 44x) to invalidate the entire i-cache.
> >>
> >> Agreed. The only thing that worries me is the use of KERNELBASE in the
> >> flush_instruction_cache() for CONFIG_4xx. Can we safely assume all 4xx
> >> implementations ignore the arguments passed to iccci ?
> >
> > Good question.  I don't know the answer. :-)
> >
> > That also may suggest a bigger can of worms.  A grep of the powerpc code
> > shows many uses of KERNELBASE.  For a relocatable kernel, nobody should
> > be relying on KERNELBASE except for the early relocation code.  Are we
> > sure that all the other usages of KERNELBASE are "safe"?
> >
> I think we could simply replace the occurrences of KERNELBASE (after the relocate())
>   with '_stext' which would give the virtual start address of the kernel.
> 
> Thanks
> Suzuki
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
diff mbox

Patch

diff --git a/arch/powerpc/kernel/reloc_32.S
b/arch/powerpc/kernel/reloc_32.S index 045d61e..cce0510 100644
--- a/arch/powerpc/kernel/reloc_32.S
+++ b/arch/powerpc/kernel/reloc_32.S
@@ -33,10 +33,9 @@  R_PPC_RELATIVE = 22
 
 _GLOBAL(relocate)
 
-	mflr	r0
+	mflr	r14		/* Save our LR */
 	bl	0f		/* Find our current runtime
address */ 0:	mflr	r12		/* Make it
accessible */
-	mtlr	r0
 
 	lwz	r11, (p_dyn - 0b)(r12)