[U-Boot] Reduce build times

Message ID 1320216842-29785-1-git-send-email-wd@denx.de
State Accepted
Commit Message

Wolfgang Denk Nov. 2, 2011, 6:54 a.m. UTC
U-Boot Makefiles contain a number of tests for compiler features etc.
which so far are executed again and again.  On some architectures
(especially ARM) this results in a large number of calls to gcc.

This patch makes sure to run such tests only once, thus largely
reducing the number of "execve" system calls.

Example: number of "execve" system calls for building the "P2020DS"
(Power Architecture) and "qong" (ARM) boards, measured as:
	-> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
	-> grep execve /tmp/foo | wc -l

	Before: After:	Reduction:
==================================
P2020DS 20555	15205	-26%
qong	31692	14490	-54%
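
The two measurement commands above can be folded into one small helper. This is only a sketch: `count_execve` is a name made up here and is not part of U-Boot or MAKEALL. It counts matching lines exactly like the original `grep | wc -l`, so if strace splits a call into an "unfinished" and a "resumed" line, both lines are counted.

```shell
# count_execve LOGFILE
# Print the number of lines in a strace log that mention execve.
# Same result as: grep execve LOGFILE | wc -l
count_execve() {
    grep -c execve "$1"
}
```

Typical use, assuming the log was written as shown above: `count_execve /tmp/foo`.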

Comments

Graeme Russ Nov. 2, 2011, 11:08 a.m. UTC | #1
Hi Wolfgang,

On 02/11/11 17:54, Wolfgang Denk wrote:
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.
> 
> This patch makes sure to run such tests only once, thus largely
> reducing the number of "execve" system calls.
> 
> Example: number of "execve" system calls for building the "P2020DS"
> (Power Architecture) and "qong" (ARM) boards, measured as:
> 	-> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
> 	-> grep execve /tmp/foo | wc -l
> 
> 	Before: After:	Reduction:
> ==================================
> P2020DS 20555	15205	-26%
> qong	31692	14490	-54%
> 
> As a result, build times are significantly reduced, typically by
> 30...50%.
> 
> Signed-off-by: Wolfgang Denk <wd@denx.de>
> Cc: Andy Fleming <afleming@gmail.com>
> Cc: Kumar Gala <galak@kernel.crashing.org>
> Cc: Albert Aribaud <albert.aribaud@free.fr>
> cc: Graeme Russ <graeme.russ@gmail.com>
> cc: Mike Frysinger <vapier@gentoo.org>
> ---

Tested on x86, does what is written on the box ;)

Tested-by: Graeme Russ <graeme.russ@gmail.com>

Regards,

Graeme
Matthias Weisser Nov. 2, 2011, 12:35 p.m. UTC | #2
Am 02.11.2011 07:54, schrieb Wolfgang Denk:
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.
>
> This patch makes sure to run such tests only once, thus largely
> reducing the number of "execve" system calls.
>
> Example: number of "execve" system calls for building the "P2020DS"
> (Power Architecture) and "qong" (ARM) boards, measured as:
> 	->  strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
> 	->  grep execve /tmp/foo | wc -l
>
> 	Before: After:	Reduction:
> ==================================
> P2020DS 20555	15205	-26%
> qong	31692	14490	-54%
>
> As a result, build times are significantly reduced, typically by
> 30...50%.
>
> Signed-off-by: Wolfgang Denk <wd@denx.de>
> Cc: Andy Fleming <afleming@gmail.com>
> Cc: Kumar Gala <galak@kernel.crashing.org>
> Cc: Albert Aribaud <albert.aribaud@free.fr>
> cc: Graeme Russ <graeme.russ@gmail.com>
> cc: Mike Frysinger <vapier@gentoo.org>
> ---

Nice. Some additional numbers:

zmx25: make
-----------
real    1m47.546s 0m57.213s -47%
user    1m39.698s 0m54.831s
sys     0m24.798s 0m9.509s


zmx25: make -j2
---------------
real    0m56.791s 0m32.187s -43%
user    1m38.478s 0m55.571s
sys     0m24.522s 0m9.513s

Tested-by: Matthias Weisser <weisserm@arcor.de>

Matthias
Sanjeev Premi Nov. 2, 2011, 2:49 p.m. UTC | #3
> -----Original Message-----
> From: u-boot-bounces@lists.denx.de 
> [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Wolfgang Denk
> Sent: Wednesday, November 02, 2011 12:24 PM
> To: u-boot@lists.denx.de
> Cc: Graeme Russ; Kumar Gala; Albert Aribaud; Andy Fleming
> Subject: [U-Boot] [PATCH] Reduce build times
> 
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.
> 
> This patch makes sure to run such tests only once, thus largely
> reducing the number of "execve" system calls.
> 
> Example: number of "execve" system calls for building the "P2020DS"
> (Power Architecture) and "qong" (ARM) boards, measured as:
> 	-> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
> 	-> grep execve /tmp/foo | wc -l
> 
> 	Before: After:	Reduction:
> ==================================
> P2020DS 20555	15205	-26%
> qong	31692	14490	-54%
> 
> As a result, build times are significantly reduced, typically by
> 30...50%.
> 
> Signed-off-by: Wolfgang Denk <wd@denx.de>
> Cc: Andy Fleming <afleming@gmail.com>
> Cc: Kumar Gala <galak@kernel.crashing.org>
> Cc: Albert Aribaud <albert.aribaud@free.fr>
> cc: Graeme Russ <graeme.russ@gmail.com>
> cc: Mike Frysinger <vapier@gentoo.org>
> ---

Results for OMAP3EVM.
(Tried 5 times just to be sure as I see >50% reduction.)

	Before	After
	------	------
real	109.03	49.78	
user	 71.43	29.06
sys	 26.83	 7.66

Compiled u-boot works fine on the board as well.

Tested-by: Sanjeev Premi <premi@ti.com> 

[snip]...[snip]
Tom Rini Nov. 2, 2011, 3:37 p.m. UTC | #4
On Wed, Nov 2, 2011 at 7:49 AM, Premi, Sanjeev <premi@ti.com> wrote:
>> -----Original Message-----
>> From: u-boot-bounces@lists.denx.de
>> [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Wolfgang Denk
>> Sent: Wednesday, November 02, 2011 12:24 PM
>> To: u-boot@lists.denx.de
>> Cc: Graeme Russ; Kumar Gala; Albert Aribaud; Andy Fleming
>> Subject: [U-Boot] [PATCH] Reduce build times
>>
>> U-Boot Makefiles contain a number of tests for compiler features etc.
>> which so far are executed again and again.  On some architectures
>> (especially ARM) this results in a large number of calls to gcc.
>>
>> This patch makes sure to run such tests only once, thus largely
>> reducing the number of "execve" system calls.
>>
>> Example: number of "execve" system calls for building the "P2020DS"
>> (Power Architecture) and "qong" (ARM) boards, measured as:
>>       -> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
>>       -> grep execve /tmp/foo | wc -l
>>
>>       Before: After:  Reduction:
>> ==================================
>> P2020DS 20555 15205   -26%
>> qong  31692   14490   -54%
>>
>> As a result, build times are significantly reduced, typically by
>> 30...50%.
>>
>> Signed-off-by: Wolfgang Denk <wd@denx.de>
>> Cc: Andy Fleming <afleming@gmail.com>
>> Cc: Kumar Gala <galak@kernel.crashing.org>
>> Cc: Albert Aribaud <albert.aribaud@free.fr>
>> cc: Graeme Russ <graeme.russ@gmail.com>
>> cc: Mike Frysinger <vapier@gentoo.org>
>> ---
>
> Results for OMAP3EVM.
> (Tried 5 times just to be sure as I see >50% reduction.)
>
>        Before  After
>        ------  ------
> real    109.03  49.78
> user     71.43  29.06
> sys      26.83   7.66

Over here omap3_evm wall-clock time on make -j12 goes from 27sec to 10sec.
Simon Glass Nov. 2, 2011, 6:20 p.m. UTC | #5
On Wed, Nov 2, 2011 at 8:37 AM, Tom Rini <tom.rini@gmail.com> wrote:
> On Wed, Nov 2, 2011 at 7:49 AM, Premi, Sanjeev <premi@ti.com> wrote:
>>> -----Original Message-----
>>> From: u-boot-bounces@lists.denx.de
>>> [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Wolfgang Denk
>>> Sent: Wednesday, November 02, 2011 12:24 PM
>>> To: u-boot@lists.denx.de
>>> Cc: Graeme Russ; Kumar Gala; Albert Aribaud; Andy Fleming
>>> Subject: [U-Boot] [PATCH] Reduce build times
>>>
>>> U-Boot Makefiles contain a number of tests for compiler features etc.
>>> which so far are executed again and again.  On some architectures
>>> (especially ARM) this results in a large number of calls to gcc.
>>>
>>> This patch makes sure to run such tests only once, thus largely
>>> reducing the number of "execve" system calls.
>>>
>>> Example: number of "execve" system calls for building the "P2020DS"
>>> (Power Architecture) and "qong" (ARM) boards, measured as:
>>>       -> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
>>>       -> grep execve /tmp/foo | wc -l
>>>
>>>       Before: After:  Reduction:
>>> ==================================
>>> P2020DS 20555 15205   -26%
>>> qong  31692   14490   -54%
>>>
>>> As a result, build times are significantly reduced, typically by
>>> 30...50%.
>>>
>>> Signed-off-by: Wolfgang Denk <wd@denx.de>

Tested-by: Simon Glass <sjg@chromium.org>

>>> Cc: Andy Fleming <afleming@gmail.com>
>>> Cc: Kumar Gala <galak@kernel.crashing.org>
>>> Cc: Albert Aribaud <albert.aribaud@free.fr>
>>> cc: Graeme Russ <graeme.russ@gmail.com>
>>> cc: Mike Frysinger <vapier@gentoo.org>
>>> ---
>>
>> Results for OMAP3EVM.
>> (Tried 5 times just to be sure as I see >50% reduction.)
>>
>>        Before  After
>>        ------  ------
>> real    109.03  49.78
>> user     71.43  29.06
>> sys      26.83   7.66
>
> Over here omap3_evm wall-clock time on make -j12 goes from 27sec to 10sec.
>
> --
> Tom
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
>

For Tegra2 Seaboard (armv7) and -j15 or so: before and after times:

full build (clobber, config) 17.177s -> 7.060s
incremental build 7.432s -> 2.267s

Thank you!

Regards,
Simon
Daniel Schwierzeck Nov. 2, 2011, 6:57 p.m. UTC | #6
Hi Wolfgang,

On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.
>
> This patch makes sure to run such tests only once, thus largely
> reducing the number of "execve" system calls.
>

maybe you want to try this experimental patch.
http://patchwork.ozlabs.org/patch/123313/

It significantly reduces the count of gcc calls by caching the results.
This also improves compilation times.
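
One way to picture result caching, as a minimal shell sketch and not the approach taken in the linked patch (which is not reproduced here): remember for each flag whether the compiler accepted it, so that repeated probes do not start the compiler again. The function name `cached_cc_option` and the cache file location are assumptions made for illustration. The probe command line follows the cc-option test used in the U-Boot Makefiles.

```shell
# Hypothetical cache file; not a path used by U-Boot or by any patch
# in this thread.
CC_OPTION_CACHE="${CC_OPTION_CACHE:-${TMPDIR:-/tmp}/cc-option.cache.$$}"

# cached_cc_option FLAG FALLBACK
# Print FLAG if the compiler accepts it, otherwise print FALLBACK.
# The compiler is started at most once per distinct FLAG; later calls
# are answered from the cache file.
cached_cc_option() {
    flag="$1" fallback="$2"
    if grep -qxF -e "ok $flag" "$CC_OPTION_CACHE" 2>/dev/null; then
        echo "$flag"
    elif grep -qxF -e "no $flag" "$CC_OPTION_CACHE" 2>/dev/null; then
        echo "$fallback"
    elif ${CC:-cc} ${CFLAGS:-} $flag -S -o /dev/null -xc /dev/null \
            > /dev/null 2>&1; then
        echo "ok $flag" >> "$CC_OPTION_CACHE"
        echo "$flag"
    else
        echo "no $flag" >> "$CC_OPTION_CACHE"
        echo "$fallback"
    fi
}
```

The sketch keys the cache on the flag only, so the file has to be removed whenever CC or CFLAGS change; a real implementation would need to handle that.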

Best regards,
Daniel
Wolfgang Denk Nov. 2, 2011, 10:48 p.m. UTC | #7
Dear Daniel Schwierzeck,

In message <CACUy__UjmnRYKMWiMB9pqr0_dS6cgiyo-MsoVY4eSH2zT6ZKHA@mail.gmail.com> you wrote:
> 
> On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
> > U-Boot Makefiles contain a number of tests for compiler features etc.
> > which so far are executed again and again.  On some architectures
> > (especially ARM) this results in a large number of calls to gcc.
> >
> > This patch makes sure to run such tests only once, thus largely
> > reducing the number of "execve" system calls.
> >
> 
> maybe you want to try this experimental patch.
> http://patchwork.ozlabs.org/patch/123313/
> 
> It significantly reduces the count of gcc calls by caching the results.
> This also improves compilation times.

Do you suggest this in addition or instead of the patch I posted?

Can you provide some measurements of build times and/or execve system
calls?

Best regards,

Wolfgang Denk
Daniel Schwierzeck Nov. 3, 2011, 1:33 a.m. UTC | #8
Hi Wolfgang,

On 02.11.2011 23:48, Wolfgang Denk wrote:
> Dear Daniel Schwierzeck,
>
> In message <CACUy__UjmnRYKMWiMB9pqr0_dS6cgiyo-MsoVY4eSH2zT6ZKHA@mail.gmail.com> you wrote:
>>
>> On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
>>> U-Boot Makefiles contain a number of tests for compiler features etc.
>>> which so far are executed again and again.  On some architectures
>>> (especially ARM) this results in a large number of calls to gcc.
>>>
>>> This patch makes sure to run such tests only once, thus largely
>>> reducing the number of "execve" system calls.
>>>
>>
>> maybe you want to try this experimental patch.
>> http://patchwork.ozlabs.org/patch/123313/
>>
>> It significantly reduces the count of gcc calls by caching the results.
>> This also improves compilation times.
>
> Do you suggest this in addition or instead of the patch I posted?

as an additional but separate patch to further reduce the execution time 
of MAKEALL.

>
> Can you provide some measurements of build times and/or execve system
> calls?

I have attached the results of some MAKEALL runs in the patch mail (I 
cc-ed you).

Best regards,
Daniel
Macpaul Lin Nov. 3, 2011, 2:21 a.m. UTC | #9
Hi Wolfgang,

2011/11/2 Wolfgang Denk <wd@denx.de>:
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.

board       before  after  reduction
adp-ag101   7259    7059   2.7%

Tested-by: Macpaul Lin <macpaul@gmail.com>

Thanks
Mike Frysinger Nov. 3, 2011, 3:53 a.m. UTC | #10
On Wednesday 02 November 2011 02:54:02 Wolfgang Denk wrote:
> U-Boot Makefiles contain a number of tests for compiler features etc.
> which so far are executed again and again.  On some architectures
> (especially ARM) this results in a large number of calls to gcc.

seems to shave ~10% off for Blackfin boards
Acked-by: Mike Frysinger <vapier@gentoo.org>

> Note:  There is further potential for build time reductions by
> performing similar optimizations for a number of $(shell ...)
> constructs in the Makefiles, but I have no good ways to test these at
> the moment so this is left as exercise for the respective
> architecture maintainers (mostly blackfin and coldfire, AFAICT) -- wd

Blackfin does two $(shell), one of which i already cache.  the other, i should 
be able to send a patch for.
-mike
Daniel Schwierzeck Nov. 3, 2011, 3:25 p.m. UTC | #11
Hi Wolfgang,

On Wed, Nov 2, 2011 at 11:48 PM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Daniel Schwierzeck,
>
> In message <CACUy__UjmnRYKMWiMB9pqr0_dS6cgiyo-MsoVY4eSH2zT6ZKHA@mail.gmail.com> you wrote:
>>
>> On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
>> > U-Boot Makefiles contain a number of tests for compiler features etc.
>> > which so far are executed again and again.  On some architectures
>> > (especially ARM) this results in a large number of calls to gcc.
>> >
>> > This patch makes sure to run such tests only once, thus largely
>> > reducing the number of "execve" system calls.
>> >
>>
>> maybe you want to try this experimental patch.
>> http://patchwork.ozlabs.org/patch/123313/
>>
>> It significantly reduces the count of gcc calls by caching the results.
>> This also improves compilation times.
>
> Do you suggest this in addition or instead of the patch I posted?
>
> Can you provide some measurements of build times and/or execve system
> calls?

I ran some additional tests with interesting results.

Board: ARM, Tegra2, seaboard
Toolchain: Sourcery G++ Lite 2011.03-41 for ARM GNU/Linux
Workstation: Core 2 Duo E6600 @ 2.4 GHz, 4 GB, x86_64

I patched the cc-option macro to count all calls like this:

 cc-option = $(shell if $(CC) $(CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
-               > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;)
+               > /dev/null 2>&1; then echo "$(1)"; echo "$1" >> $(OBJTREE)/cc-option; else echo "$(2)"; fi ;)

I ran the steps below for following source trees:
- unmodified HEAD
- only your patch
- only my patch
- both patches combined

Steps:
Complete build:
-> git clean -xdf
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- make seaboard_config
-> time CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y make -s
-> cat cc-option | wc -l

Incremental rebuild:
-> time CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y make -s

Complete build with strace:
-> git clean -xdf
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- make seaboard_config
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y strace -f -e trace=execve -o strace.out make -s
-> grep execve strace.out | wc -l

Results:
unmodified HEAD:
real	1m11.540s
user	2m7.170s
sys	0m19.840s

cc-option calls 3024

real	0m20.176s
user	0m39.260s
sys	0m6.480s

execve calls 16502

only your patch:
real	0m32.371s
user	0m47.440s
sys	0m7.900s

cc-option calls 864

real	0m9.606s
user	0m16.890s
sys	0m2.940s

execve calls 5906

only my patch:
real	0m28.187s
user	0m56.030s
sys	0m7.820s

cc-option calls 20

real	0m5.013s
user	0m13.300s
sys	0m2.200s

execve calls 7415

both patches combined:
real	0m19.777s
user	0m28.010s
sys	0m4.100s

cc-option calls 8

real	0m2.902s
user	0m6.400s
sys	0m1.070s

execve calls 3329

Conclusion:
- complete build time reduced from 1m11s to 20s
- incremental rebuild time reduced from 20s to 3s
- cc-option calls reduced from 3024 to 8
- execve calls reduced from 16502 to 3329

Best regards,
Daniel
Wolfgang Denk Nov. 3, 2011, 7:49 p.m. UTC | #12
Dear Daniel Schwierzeck,

In message <CACUy__W_Z85aLiNUQXMxE3trrHm4auEqOBXBqs6DfSRFEPh9CA@mail.gmail.com> you wrote:
> 
> Conclusion:
> - complete build time reduced from 1m11s to 20s
> - incremental rebuild time reduced from 20s to 3s
> - cc-option calls reduced from 3024 to 8
> - execve calls reduced from 16502 to 3329

That's really cool.

Can we please add another two or three of such optimizations? :-)

Best regards,

Wolfgang Denk
Aneesh V Nov. 4, 2011, 2:01 a.m. UTC | #13
Hi Daniel, Wolfgang,

On Thursday 03 November 2011 08:55 PM, Daniel Schwierzeck wrote:
> Hi Wolfgang,
[snip ..]
>
> Conclusion:
> - complete build time reduced from 1m11s to 20s
> - incremental rebuild time reduced from 20s to 3s
> - cc-option calls reduced from 3024 to 8
> - execve calls reduced from 16502 to 3329

Results for omap4 sdp build:
Build machine: Intel Core i5 2.5 GHz, 3M cache, 4GB DDR3

Un-modified HEAD:
real	0m21.463s
user	0m31.278s
sys	0m9.281s

With only Wolfgang's patch:
real	0m11.226s
user	0m23.937s
sys	0m4.200s

With only Daniel's patch:
real	0m10.842s
user	0m21.725s
sys	0m2.532s

With both patches:
real	0m8.306s
user	0m21.201s
sys	0m2.408s

Looks like both patches are helping. Thanks!!

br,
Aneesh

>
> Best regards,
> Daniel
Patch

==================================
P2020DS 20555	15205	-26%
qong	31692	14490	-54%

As a result, build times are significantly reduced, typically by
30...50%.

Signed-off-by: Wolfgang Denk <wd@denx.de>
Cc: Andy Fleming <afleming@gmail.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Albert Aribaud <albert.aribaud@free.fr>
cc: Graeme Russ <graeme.russ@gmail.com>
cc: Mike Frysinger <vapier@gentoo.org>
---

More detailed build results:

1) Number of "execve" system calls for two exemplary boards:
   "P2020DS" for Power Architecture, and "qong" for ARM.
   Measured as:
   -> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
   -> grep execve /tmp/foo | wc -l

                Before: After:  Reduction:
   =======================================
   P2020DS      20555   15205   -26%
   qong         31692   14490   -54%

2) Build time for single boards:
   Measured as "time ./MAKEALL <board>" (average over 3 runs after a
   dummy run to populate the file system; using ELDK 4.2 tool chain;
   measured for "P2020DS" and "qong" on a Core2 Quad CPU at 2.83GHz,
   for "m28evk" on a i.MX28 at 454MHz over NFS)

                Before: After:  Reduction:
   =======================================
   P2020DS:
       real     29.429s 19.494s -34%
       user     64.035s 48.621s -24%
       sys      16.188s  9.203s -43%
   qong:
       real     34.274s 16.263s -53%
       user     65.014s 39.516s -39%
       sys      27.606s  9.551s -65%
   m28evk:
       real     45.752m 27.606m -40%
       user     24.868m 17.511m -30%
       sys      17.017m  6.642m -61%


3) Build time for MAKEALL:
   Measured as "time MAKEALL_LOGDIR=/work/wd/tmp-LOG
   BUILD_DIR=/work/wd/tmp ./MAKEALL <arch>" (using ELDK 4.2 tool
   chain; measured for "ppc" and "arm" on a Core i7 at 3.07GHz)

                Before: After:  Reduction:
   =======================================
   ppc
       real    82.063m  66.595m	-19%
       user   261.710m 231.429m	-12%
       sys     61.739m	49.193m	-20%
   arm
       real    55.269m  20.763m	-62%
       user    84.302m  49.154m	-42%
       sys     38.933m  15.028m	-61%


Note:  There is further potential for build time reductions by
performing similar optimizations for a number of $(shell ...)
constructs in the Makefiles, but I have no good ways to test these at
the moment so this is left as exercise for the respective
architecture maintainers (mostly blackfin and coldfire, AFAICT) -- wd
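
Background for the changes below: in GNU make, a variable assigned with "=" is expanded again on every reference, so any $(shell ...) or cc-option call in its value runs each time, while ":=" expands the value once at the point of assignment. The following shell sketch makes the difference visible by counting how often a probe actually runs. The helper name `count_probe_runs`, the variable PROBE and the throwaway Makefile are made up for this demonstration and do not appear in U-Boot; it assumes GNU make is installed as `make`.

```shell
# count_probe_runs OP
# Write a tiny Makefile whose PROBE variable is assigned with OP
# ('=' or ':=') and referenced three times in a recipe, run make on it,
# and print how often the $(shell ...) probe was executed.
count_probe_runs() {
    op="$1"
    dir=$(mktemp -d) || return 1
    cat > "$dir/Makefile" <<EOF
PROBE $op \$(shell echo run >> $dir/count; echo -O2)
all: ; @true \$(PROBE) \$(PROBE) \$(PROBE)
EOF
    make -s -C "$dir" all > /dev/null
    wc -l < "$dir/count" | tr -d ' '
    rm -rf "$dir"
}

# Only run the demonstration when make is available.
if command -v make > /dev/null 2>&1; then
    echo "recursive (=):  $(count_probe_runs '=') probe runs"
    echo "simple    (:=): $(count_probe_runs ':=') probe runs"
fi
```

With "=" the probe runs once per reference; with ":=" it runs once in total, which is the effect this patch relies on when it stores each cc-option result in a ":=" variable before appending it to the platform flags.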

 Makefile                                 |    2 +-
 arch/arm/config.mk                       |   19 ++++++++++---------
 arch/arm/cpu/arm1136/config.mk           |    3 ++-
 arch/arm/cpu/arm1176/config.mk           |    4 +++-
 arch/arm/cpu/arm1176/s3c64xx/config.mk   |    4 +++-
 arch/arm/cpu/arm720t/config.mk           |    4 +++-
 arch/arm/cpu/arm920t/config.mk           |    3 ++-
 arch/arm/cpu/arm925t/config.mk           |    3 ++-
 arch/arm/cpu/arm926ejs/at91/config.mk    |    3 ++-
 arch/arm/cpu/arm926ejs/config.mk         |    3 ++-
 arch/arm/cpu/arm946es/config.mk          |    3 ++-
 arch/arm/cpu/arm_intcm/config.mk         |    3 ++-
 arch/arm/cpu/armv7/config.mk             |    4 ++--
 arch/arm/cpu/armv7/omap-common/config.mk |    5 +++--
 arch/arm/cpu/ixp/config.mk               |    3 ++-
 arch/arm/cpu/lh7a40x/config.mk           |    3 ++-
 arch/arm/cpu/pxa/config.mk               |    3 ++-
 arch/arm/cpu/s3c44b0/config.mk           |    3 ++-
 arch/arm/cpu/sa1100/config.mk            |    3 ++-
 arch/powerpc/cpu/mpc824x/Makefile        |    3 +--
 arch/powerpc/cpu/mpc85xx/config.mk       |    5 +++--
 arch/x86/config.mk                       |   10 ++++++----
 board/siemens/SCM/Makefile               |    3 +--
 config.mk                                |    8 +++++---
 examples/standalone/Makefile             |    3 ++-
 25 files changed, 67 insertions(+), 43 deletions(-)

diff --git a/Makefile b/Makefile
index 9ef33f9..82de62b 100644
--- a/Makefile
+++ b/Makefile
@@ -320,7 +320,7 @@  else
 PLATFORM_LIBGCC = -L $(USE_PRIVATE_LIBGCC) -lgcc
 endif
 else
-PLATFORM_LIBGCC = -L $(shell dirname `$(CC) $(CFLAGS) -print-libgcc-file-name`) -lgcc
+PLATFORM_LIBGCC := -L $(shell dirname `$(CC) $(CFLAGS) -print-libgcc-file-name`) -lgcc
 endif
 PLATFORM_LIBS += $(PLATFORM_LIBGCC)
 export PLATFORM_LIBS
diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index 9b4e581..45f9dca 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -34,7 +34,7 @@  endif
 PLATFORM_CPPFLAGS += -DCONFIG_ARM -D__ARM__
 
 # Explicitly specifiy 32-bit ARM ISA since toolchain default can be -mthumb:
-PLATFORM_CPPFLAGS += $(call cc-option,-marm,)
+PF_CPPFLAGS_ARM := $(call cc-option,-marm,)
 
 # Try if EABI is supported, else fall back to old API,
 # i. e. for example:
@@ -44,15 +44,16 @@  PLATFORM_CPPFLAGS += $(call cc-option,-marm,)
 #	-mabi=apcs-gnu -mno-thumb-interwork
 # - with ELDK 3.1 (gcc 3.x), use:
 #	-mapcs-32 -mno-thumb-interwork
-PLATFORM_CPPFLAGS += $(call cc-option,\
-				-mabi=aapcs-linux -mno-thumb-interwork,\
+PF_CPPFLAGS_ABI := $(call cc-option,\
+			-mabi=aapcs-linux -mno-thumb-interwork,\
+			$(call cc-option,\
+				-mapcs-32,\
 				$(call cc-option,\
-					-mapcs-32,\
-					$(call cc-option,\
-						-mabi=apcs-gnu,\
-					)\
-				) $(call cc-option,-mno-thumb-interwork,)\
-			)
+					-mabi=apcs-gnu,\
+				)\
+			) $(call cc-option,-mno-thumb-interwork,)\
+		)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_ARM) $(PF_CPPFLAGS_ABI)
 
 # For EABI, make sure to provide raise()
 ifneq (,$(findstring -mabi=aapcs-linux,$(PLATFORM_CPPFLAGS)))
diff --git a/arch/arm/cpu/arm1136/config.mk b/arch/arm/cpu/arm1136/config.mk
index 3e68535..efee0d1 100644
--- a/arch/arm/cpu/arm1136/config.mk
+++ b/arch/arm/cpu/arm1136/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm1176/config.mk b/arch/arm/cpu/arm1176/config.mk
index 14346cf..222d352 100644
--- a/arch/arm/cpu/arm1176/config.mk
+++ b/arch/arm/cpu/arm1176/config.mk
@@ -29,4 +29,6 @@  PLATFORM_CPPFLAGS += -march=armv5t
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm1176/s3c64xx/config.mk b/arch/arm/cpu/arm1176/s3c64xx/config.mk
index 14346cf..222d352 100644
--- a/arch/arm/cpu/arm1176/s3c64xx/config.mk
+++ b/arch/arm/cpu/arm1176/s3c64xx/config.mk
@@ -29,4 +29,6 @@  PLATFORM_CPPFLAGS += -march=armv5t
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm720t/config.mk b/arch/arm/cpu/arm720t/config.mk
index 3844c62..210c6dc 100644
--- a/arch/arm/cpu/arm720t/config.mk
+++ b/arch/arm/cpu/arm720t/config.mk
@@ -30,4 +30,6 @@  PLATFORM_CPPFLAGS += -march=armv4 -mtune=arm7tdmi
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm920t/config.mk b/arch/arm/cpu/arm920t/config.mk
index 8f6c1a3..f03030a 100644
--- a/arch/arm/cpu/arm920t/config.mk
+++ b/arch/arm/cpu/arm920t/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm925t/config.mk b/arch/arm/cpu/arm925t/config.mk
index 8f6c1a3..f03030a 100644
--- a/arch/arm/cpu/arm925t/config.mk
+++ b/arch/arm/cpu/arm925t/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm926ejs/at91/config.mk b/arch/arm/cpu/arm926ejs/at91/config.mk
index 19296fd..370630d 100644
--- a/arch/arm/cpu/arm926ejs/at91/config.mk
+++ b/arch/arm/cpu/arm926ejs/at91/config.mk
@@ -1 +1,2 @@ 
-PLATFORM_CPPFLAGS += $(call cc-option,-mtune=arm926ejs,)
+PF_CPPFLAGS_TUNE := $(call cc-option,-mtune=arm926ejs,)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_TUNE)
diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk
index f8ef90f..ffb2e6c 100644
--- a/arch/arm/cpu/arm926ejs/config.mk
+++ b/arch/arm/cpu/arm926ejs/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv5te
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm946es/config.mk b/arch/arm/cpu/arm946es/config.mk
index e783f69..c2354ba 100644
--- a/arch/arm/cpu/arm946es/config.mk
+++ b/arch/arm/cpu/arm946es/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS +=  -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm_intcm/config.mk b/arch/arm/cpu/arm_intcm/config.mk
index e783f69..c2354ba 100644
--- a/arch/arm/cpu/arm_intcm/config.mk
+++ b/arch/arm/cpu/arm_intcm/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS +=  -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
index 49ac9c7..83ddf10 100644
--- a/arch/arm/cpu/armv7/config.mk
+++ b/arch/arm/cpu/armv7/config.mk
@@ -29,5 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,\
-		    $(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/armv7/omap-common/config.mk b/arch/arm/cpu/armv7/omap-common/config.mk
index 49ac9c7..c400dcc 100644
--- a/arch/arm/cpu/armv7/omap-common/config.mk
+++ b/arch/arm/cpu/armv7/omap-common/config.mk
@@ -29,5 +29,6 @@  PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,\
-		    $(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/ixp/config.mk b/arch/arm/cpu/ixp/config.mk
index 5868cba..9149665 100644
--- a/arch/arm/cpu/ixp/config.mk
+++ b/arch/arm/cpu/ixp/config.mk
@@ -37,4 +37,5 @@  LDFLAGS_u-boot += --gc-sections
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/lh7a40x/config.mk b/arch/arm/cpu/lh7a40x/config.mk
index 47b2b7b..1c4aa97 100644
--- a/arch/arm/cpu/lh7a40x/config.mk
+++ b/arch/arm/cpu/lh7a40x/config.mk
@@ -29,4 +29,5 @@  PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/pxa/config.mk b/arch/arm/cpu/pxa/config.mk
index a05d69c..0bbe295 100644
--- a/arch/arm/cpu/pxa/config.mk
+++ b/arch/arm/cpu/pxa/config.mk
@@ -30,4 +30,5 @@  PLATFORM_CPPFLAGS += -march=armv5te -mtune=xscale
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/s3c44b0/config.mk b/arch/arm/cpu/s3c44b0/config.mk
index 7454d72..f6f6398 100644
--- a/arch/arm/cpu/s3c44b0/config.mk
+++ b/arch/arm/cpu/s3c44b0/config.mk
@@ -30,4 +30,5 @@  PLATFORM_CPPFLAGS += -march=armv4 -mtune=arm7tdmi -msoft-float
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/sa1100/config.mk b/arch/arm/cpu/sa1100/config.mk
index 6f21f41..06af160 100644
--- a/arch/arm/cpu/sa1100/config.mk
+++ b/arch/arm/cpu/sa1100/config.mk
@@ -30,4 +30,5 @@  PLATFORM_CPPFLAGS += -march=armv4 -mtune=strongarm1100
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/powerpc/cpu/mpc824x/Makefile b/arch/powerpc/cpu/mpc824x/Makefile
index 2bfcd85..ebf4cb2 100644
--- a/arch/powerpc/cpu/mpc824x/Makefile
+++ b/arch/powerpc/cpu/mpc824x/Makefile
@@ -23,8 +23,7 @@ 
 
 include $(TOPDIR)/config.mk
 ifneq ($(OBJTREE),$(SRCTREE))
-$(shell mkdir -p $(obj)drivers/epic)
-$(shell mkdir -p $(obj)drivers/i2c)
+$(shell mkdir -p $(obj)drivers/epic $(obj)drivers/i2c)
 endif
 
 LIB	= $(obj)lib$(CPU).o
diff --git a/arch/powerpc/cpu/mpc85xx/config.mk b/arch/powerpc/cpu/mpc85xx/config.mk
index 68ac57d..f36d823 100644
--- a/arch/powerpc/cpu/mpc85xx/config.mk
+++ b/arch/powerpc/cpu/mpc85xx/config.mk
@@ -28,5 +28,6 @@  PLATFORM_CPPFLAGS += -ffixed-r2 -Wa,-me500 -msoft-float -mno-string
 # -mspe=yes is needed to have -mno-spe accepted by a buggy GCC;
 # see "[PATCH,rs6000] make -mno-spe work as expected" on
 # http://gcc.gnu.org/ml/gcc-patches/2008-04/msg00311.html
-PLATFORM_CPPFLAGS +=$(call cc-option,-mspe=yes)
-PLATFORM_CPPFLAGS +=$(call cc-option,-mno-spe)
+PF_CPPFLAGS_SPE := $(call cc-option,-mspe=yes) \
+		   $(call cc-option,-mno-spe)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_SPE)
diff --git a/arch/x86/config.mk b/arch/x86/config.mk
index ee23c9f..fe9083f 100644
--- a/arch/x86/config.mk
+++ b/arch/x86/config.mk
@@ -27,10 +27,12 @@  PLATFORM_CPPFLAGS += -fno-strict-aliasing
 PLATFORM_CPPFLAGS += -Wstrict-prototypes
 PLATFORM_CPPFLAGS += -mregparm=3
 PLATFORM_CPPFLAGS += -fomit-frame-pointer
-PLATFORM_CPPFLAGS += $(call cc-option, -ffreestanding)
-PLATFORM_CPPFLAGS += $(call cc-option, -fno-toplevel-reorder,  $(call cc-option, -fno-unit-at-a-time))
-PLATFORM_CPPFLAGS += $(call cc-option, -fno-stack-protector)
-PLATFORM_CPPFLAGS += $(call cc-option, -mpreferred-stack-boundary=2)
+PF_CPPFLAGS_X86   := $(call cc-option, -ffreestanding) \
+		     $(call cc-option, -fno-toplevel-reorder, \
+		       $(call cc-option, -fno-unit-at-a-time)) \
+		     $(call cc-option, -fno-stack-protector) \
+		     $(call cc-option, -mpreferred-stack-boundary=2)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_X86)
 PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
 PLATFORM_CPPFLAGS += -DREALMODE_BASE=0x7c0
 
diff --git a/board/siemens/SCM/Makefile b/board/siemens/SCM/Makefile
index 07cc5a6..07db9d4 100644
--- a/board/siemens/SCM/Makefile
+++ b/board/siemens/SCM/Makefile
@@ -24,8 +24,7 @@ 
 include $(TOPDIR)/config.mk
 
 ifneq ($(OBJTREE),$(SRCTREE))
-$(shell mkdir -p $(obj)../common)
-$(shell mkdir -p $(obj)../../tqc/tqm8xx)
+$(shell mkdir -p $(obj)../common $(obj)../../tqc/tqm8xx)
 endif
 
 LIB	= $(obj)lib$(BOARD).o
diff --git a/config.mk b/config.mk
index 11b67e5..918cffe 100644
--- a/config.mk
+++ b/config.mk
@@ -209,11 +209,13 @@  else
 CFLAGS := $(CPPFLAGS) -Wall -Wstrict-prototypes
 endif
 
-CFLAGS += $(call cc-option,-fno-stack-protector)
+CFLAGS_SSP := $(call cc-option,-fno-stack-protector)
+CFLAGS += $(CFLAGS_SSP)
 # Some toolchains enable security related warning flags by default,
 # but they don't make much sense in the u-boot world, so disable them.
-CFLAGS += $(call cc-option,-Wno-format-nonliteral)
-CFLAGS += $(call cc-option,-Wno-format-security)
+CFLAGS_WARN := $(call cc-option,-Wno-format-nonliteral) \
+	       $(call cc-option,-Wno-format-security)
+CFLAGS += $(CFLAGS_WARN)
 
 # $(CPPFLAGS) sets -g, which causes gcc to pass a suitable -g<format>
 # option to the assembler.
diff --git a/examples/standalone/Makefile b/examples/standalone/Makefile
index b1e33fb..e23865b 100644
--- a/examples/standalone/Makefile
+++ b/examples/standalone/Makefile
@@ -85,7 +85,8 @@  endif
 # We don't want gcc reordering functions if possible.  This ensures that an
 # application's entry point will be the first function in the application's
 # source file.
-CFLAGS += $(call cc-option,-fno-toplevel-reorder)
+CFLAGS_NTR := $(call cc-option,-fno-toplevel-reorder)
+CFLAGS += $(CFLAGS_NTR)
 
 all:	$(obj).depend $(OBJS) $(LIB) $(SREC) $(BIN) $(ELF)