| Submitter | Andreas Bießmann |
|---|---|
| Date | Nov. 30, 2010, 7:06 a.m. |
| Message ID | <1291100800-61850-2-git-send-email-andreas.devel@googlemail.com> |
| Download | mbox | patch |
| Permalink | /patch/73576/ |
| State | RFC |
| Headers | show |
Comments
Hi Andreas, Le 30/11/2010 08:06, Andreas Bießmann a écrit : > diff --git a/arch/arm/cpu/arm920t/start.S b/arch/arm/cpu/arm920t/start.S > index 01edb9b..71de373 100644 > --- a/arch/arm/cpu/arm920t/start.S > +++ b/arch/arm/cpu/arm920t/start.S > @@ -208,15 +208,16 @@ stack_setup: > mov sp, r4 > > adr r0, _start > + cmp r0, r6 > + beq clear_bss /* skip relocation */ > + mov r1, r6 Why use r1? > ldr r2, _TEXT_BASE > ldr r3, _bss_start_ofs > add r2, r0, r3 /* r2<- source end address */ > - cmp r0, r6 > - beq clear_bss > > copy_loop: > ldmia r0!, {r9-r10} /* copy from source address [r0] */ > - stmia r6!, {r9-r10} /* copy to target address [r1] */ > + stmia r1!, {r9-r10} /* copy to target address [r1] */ Ditto. > cmp r0, r2 /* until source end address [r2] */ > blo copy_loop Amicalement,
Dear Albert ARIBAUD, Am 30.11.2010 09:07, schrieb Albert ARIBAUD: > Hi Andreas, > > Le 30/11/2010 08:06, Andreas Bießmann a écrit : > >> diff --git a/arch/arm/cpu/arm920t/start.S b/arch/arm/cpu/arm920t/start.S >> index 01edb9b..71de373 100644 >> --- a/arch/arm/cpu/arm920t/start.S >> +++ b/arch/arm/cpu/arm920t/start.S >> @@ -208,15 +208,16 @@ stack_setup: >> mov sp, r4 >> >> adr r0, _start >> + cmp r0, r6 >> + beq clear_bss /* skip relocation */ >> + mov r1, r6 > > Why use r1? Use a scratch register here cause stmia Rn! does increment Rn. > >> ldr r2, _TEXT_BASE >> ldr r3, _bss_start_ofs >> add r2, r0, r3 /* r2<- source end address */ >> - cmp r0, r6 >> - beq clear_bss >> >> copy_loop: >> ldmia r0!, {r9-r10} /* copy from source address >> [r0] */ >> - stmia r6!, {r9-r10} /* copy to target address [r1] */ >> + stmia r1!, {r9-r10} /* copy to target address [r1] */ Therefore usage of r6 here would destroy the saved 'addr of destination'. Cause that fact we have saved the 'addr of destination' @ beginning of relocate_code twice, once to r6 and once to r7. This is obviously not needed iv we change the register in copy_loop (which was already used in comment .. if you saw that on the end of the line). I doubt if this a 'speed up' but we can have a cleaner interface. We have r4, r5, r6 as storage of input values to relocate_code. We do only use scratch registers for copy_loop, fixloop for relocation, clear_bss later on. We do decide if we need fixloop for relocation or if we can skip that and then setup the relevant scratch registers. Well this is only an RFC. I found that plus the NULL pointer stuff worth to mention. It is not that important to get this special patch in cause the current implementation do also work. regards Andreas Bießmann
Le 30/11/2010 09:28, Andreas Bießmann a écrit : >>> + beq clear_bss /* skip relocation */ >>> + mov r1, r6 >> >> Why use r1? > > Use a scratch register here cause stmia Rn! does increment Rn. > Therefore usage of r6 here would destroy the saved 'addr of > destination'. Cause that fact we have saved the 'addr of destination' @ > beginning of relocate_code twice, once to r6 and once to r7. This is > obviously not needed iv we change the register in copy_loop (which was > already used in comment .. if you saw that on the end of the line). Understood. > I doubt if this a 'speed up' but we can have a cleaner interface. We > have r4, r5, r6 as storage of input values to relocate_code. We do only > use scratch registers for copy_loop, fixloop for relocation, clear_bss > later on. We do decide if we need fixloop for relocation or if we can > skip that and then setup the relevant scratch registers. > > Well this is only an RFC. I found that plus the NULL pointer stuff worth > to mention. It is not that important to get this special patch in cause > the current implementation do also work. The r8 issue and the zero-filled reloc entry issue are important to pull in as they can cause all sorts of time-wasting issues. > regards > > Andreas Bießmann Amicalement,
Patch
diff --git a/arch/arm/cpu/arm920t/start.S b/arch/arm/cpu/arm920t/start.S index 01edb9b..71de373 100644 --- a/arch/arm/cpu/arm920t/start.S +++ b/arch/arm/cpu/arm920t/start.S @@ -208,15 +208,16 @@ stack_setup: mov sp, r4 adr r0, _start + cmp r0, r6 + beq clear_bss /* skip relocation */ + mov r1, r6 ldr r2, _TEXT_BASE ldr r3, _bss_start_ofs add r2, r0, r3 /* r2 <- source end address */ - cmp r0, r6 - beq clear_bss copy_loop: ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r6!, {r9-r10} /* copy to target address [r1] */ + stmia r1!, {r9-r10} /* copy to target address [r1] */ cmp r0, r2 /* until source end address [r2] */ blo copy_loop
in case we are still at relocation target address before relocation we do not need to load the registers needed for relocation. We should instead skip the whole relocation part and jump over to clear_bss. Also prepare to not use target address twice. When we use a scratch register here r6 is unchanged and can be used later on. Signed-off-by: Andreas Bießmann <andreas.devel@googlemail.com> --- arch/arm/cpu/arm920t/start.S | 7 ++++--- 1 files changed, 4 insertions(+), 3 deletions(-)