[U-Boot] Allow for parallel builds and saved output

Message ID	1320305309-24959-1-git-send-email-afleming@freescale.com
State	Superseded
Delegated to:	Wolfgang Denk
Headers	show Return-Path: <u-boot-bounces@lists.denx.de> From: Andy Fleming <afleming@freescale.com> To: <wd@denx.de> Date: Thu, 3 Nov 2011 02:28:29 -0500 Message-ID: <1320305309-24959-1-git-send-email-afleming@freescale.com> MIME-Version: 1.0 Cc: u-boot@lists.denx.de Subject: [U-Boot] [PATCH] Allow for parallel builds and saved output Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: u-boot-bounces@lists.denx.de Errors-To: u-boot-bounces@lists.denx.de

Andy Fleming Nov. 3, 2011, 7:28 a.m. UTC

The MAKEALL script cleverly runs make with the appropriate options
to use all of the cores on the system, but your average U-Boot build
can't make much use of more than a few cores.  If you happen to have
a many-core server, your builds will leave most of the system idle.

In order to make full use of such a system, we need to build multiple
targets in parallel, and this requires directing make output into
multiple directories. We add a BUILD_NBUILDS variable, which allows
users to specify how many builds to run in parallel. I've found that
16 is too many on my system (fork starts to fail). When BUILD_NBUILDS
is set greater than 1, we redefine BUILD_DIR for each build to be
${BUILD_DIR}/${target}. Also, we make "./build" the default BUILD_DIR
when BUILD_NBUILDS is greater than 1.

Once each build finishes, we run make clean on its directory, to reduce
the footprint, and also remove all .depend files.

As a result, we are left with a build directory with all of the built
targets still there for use, which means anyone who wanted to use
MAKEALL as part of a test harness can now do so.

Signed-off-by: Andy Fleming <afleming@freescale.com>
---
Inspired by all the MAKEALL improvements, I decided to clean up my old
one for parallel builds. I think this version addresses the concerns
raised last time...

 MAKEALL |   74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 64 insertions(+), 10 deletions(-)

Simon Glass Nov. 3, 2011, 4:22 p.m. UTC | #1

Hi Andy,

On Thu, Nov 3, 2011 at 12:28 AM, Andy Fleming <afleming@freescale.com> wrote:
> The MAKEALL script cleverly runs make with the appropriate options
> to use all of the cores on the system, but your average U-Boot build
> can't make much use of more than a few cores.  If you happen to have
> a many-core server, your builds will leave most of the system idle.
>
> In order to make full use of such a system, we need to build multiple
> targets in parallel, and this requires directing make output into
> multiple directories. We add a BUILD_NBUILDS variable, which allows
> users to specify how many builds to run in parallel. I've found that
> 16 is too many on my system (fork starts to fail). When BUILD_NBUILDS
> is set greater than 1, we redefine BUILD_DIR for each build to be
> ${BUILD_DIR}/${target}. Also, we make "./build" the default BUILD_DIR
> when BUILD_NBUILDS is greater than 1.
>
> Once each build finishes, we run make clean on its directory, to reduce
> the footprint, and also remove all .depend files.
>
> As a result, we are left with a build directory with all of the built
> targets still there for use, which means anyone who wanted to use
> MAKEALL as part of a test harness can now do so.
>
> Signed-off-by: Andy Fleming <afleming@freescale.com>
> ---
> Inspired by all the MAKEALL improvements, I decided to clean up my old
> one for parallel builds. I think this version addresses the concerns
> raised last time...

Thanks very much for posting this. The low CPU utilization is
something that has bugged me - I get maybe 20% at best.  I was
thinking about writing a little script which fires off one job as soon
as the first complete, and keeps maybe 20 jobs running at once. This
patch does most of that, with the limitation that jobs that finish
early can leave idle CPU time.

With your patch I get at least 70% CPU averaged over the whole
MAKEALL, and all the ARM boards build in 5 minutes (with Wolfgang and
Daniel Schwierzeck's patches also). The only proviso is that you need
to set BUILD_NCPUS to 1 I think - maybe your patch should do that also
? Otherwise I got a load average of 220...

If you get back to this sometime the other little wish is that it
could perhaps display the output of the build together, perhaps
buffering it in a log file first. So instead of:

Configuring for imx31_phycore_eet - Board: imx31_phycore, Options:
IMX31_PHYCORE_EET
Configuring for imx31_phycore_eet - Board: imx31_phycore, Options:
IMX31_PHYCORE_EET
Configuring for imx31_phycore board...
board.c:389: error: 'MACH_TYPE_TIAM335EVM' undeclared (first use in
this function)
board.c:389: error: (Each undeclared identifier is reported only once
board.c:389: error: for each function it appears in.)
make[1]: *** [/c/cosarm/buildall/u-boot/build/am335x_evm/arch/arm/lib/board.o]
Error 1
make: *** [/c/cosarm/buildall/u-boot/build/am335x_evm/arch/arm/lib/libarm.o]
Error 2
arm-none-linux-gnueabi-size: './build/am335x_evm/u-boot': No such file
cmd_version.c:29: error: expected ',' or ';' before 'U_BOOT_DATE'
   text	   data	    bss	    dec	    hex	filename
 176792	   3144	 223300	 403236	  62724	./build/ca9x4_ct_vxp/u-boot
   text	   data	    bss	    dec	    hex	filename
 156236	   2788	  10776	 169800	  29748	./build/highbank/u-boot

it could do:
Configuring for <board1>
warnings/errors for board1
size info for board1

Configuring for <board2>
warnings/errors for board2
size info for board2

...

Tested-by: Simon Glass <sjg@chromium.org>

Regards,
Simon

>
>  MAKEALL |   74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 files changed, 64 insertions(+), 10 deletions(-)
>
> diff --git a/MAKEALL b/MAKEALL
> index 95b7cd3..4583ddb 100755
> --- a/MAKEALL
> +++ b/MAKEALL
> @@ -31,6 +31,7 @@ usage()
>          CROSS_COMPILE    cross-compiler toolchain prefix (default: "")
>          MAKEALL_LOGDIR   output all logs to here (default: ./LOG/)
>          BUILD_DIR        output build directory (default: ./)
> +         BUILD_NBUILDS    number of parallel targets (default: 1)
>
>        Examples:
>          - build all Power Architecture boards:
> @@ -160,10 +161,23 @@ else
>        LOG_DIR="LOG"
>  fi
>
> +if [ ! "${BUILD_NBUILDS}" ] ; then
> +       BUILD_NBUILDS="1"
> +fi
> +
> +if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +       if [ ! "${BUILD_DIR}" ] ; then
> +               BUILD_DIR="./build"
> +       fi
> +       mkdir -p ${BUILD_DIR}/ERR
> +fi
> +
>  if [ ! "${BUILD_DIR}" ] ; then
>        BUILD_DIR="."
>  fi
>
> +OUTPUT_PREFIX="${BUILD_DIR}"
> +
>  [ -d ${LOG_DIR} ] || mkdir ${LOG_DIR} || exit 1
>
>  LIST=""
> @@ -483,32 +497,52 @@ LIST_sparc="$(boards_by_arch sparc)"
>
>  LIST_nds32="$(boards_by_arch nds32)"
>
> +CURRENT_COUNT=0
> +
>  #-----------------------------------------------------------------------
>
>  build_target() {
>        target=$1
>
> +       if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +               output_dir="${OUTPUT_PREFIX}/${target}"
> +               mkdir -p ${output_dir}
> +       else
> +               output_dir="${OUTPUT_PREFIX}"
> +       fi
> +
> +       export BUILD_DIR="${output_dir}"
> +
>        ${MAKE} distclean >/dev/null
>        ${MAKE} -s ${target}_config
>
> -       ${MAKE} ${JOBS} all 2>&1 >${LOG_DIR}/$target.MAKELOG \
> -                               | tee ${LOG_DIR}/$target.ERR
> +       ${MAKE} ${JOBS} all \
> +               2>&1 >${LOG_DIR}/$target.MAKELOG | tee ${LOG_DIR}/$target.ERR
>
>        # Check for 'make' errors
>        if [ ${PIPESTATUS[0]} -ne 0 ] ; then
>                RC=1
>        fi
>
> -       if [ -s ${LOG_DIR}/$target.ERR ] ; then
> -               ERR_CNT=$((ERR_CNT + 1))
> -               ERR_LIST="${ERR_LIST} $target"
> +       if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +               ${MAKE} clean
> +               find "${output_dir}" -type f -name '*.depend' | xargs rm
> +
> +               if [ -s ${LOG_DIR}/${target}.ERR ] ; then
> +                       touch ${OUTPUT_PREFIX}/ERR/${target}
> +               else
> +                       rm ${LOG_DIR}/${target}.ERR
> +               fi
>        else
> -               rm ${LOG_DIR}/$target.ERR
> +               if [ -s ${LOG_DIR}/${target}.ERR ] ; then
> +                       ERR_CNT=$((ERR_CNT + 1))
> +                       ERR_LIST="${ERR_LIST} $target"
> +               else
> +                       rm ${LOG_DIR}/${target}.ERR
> +               fi
>        fi
>
> -       TOTAL_CNT=$((TOTAL_CNT + 1))
> -
> -       ${CROSS_COMPILE}size ${BUILD_DIR}/u-boot \
> +       ${CROSS_COMPILE}size ${output_dir}/u-boot \
>                                | tee -a ${LOG_DIR}/$target.MAKELOG
>  }
>  build_targets() {
> @@ -523,7 +557,20 @@ build_targets() {
>                if [ -n "${list}" ] ; then
>                        build_targets ${list}
>                else
> -                       build_target ${t}
> +                       if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +                               build_target ${t} &
> +                       else
> +                               build_target ${t}
> +                       fi
> +
> +                       TOTAL_CNT=$((TOTAL_CNT + 1))
> +                       CURRENT_COUNT=$((CURRENT_COUNT + 1))
> +               fi
> +
> +               # Limit number of parallel builds
> +               if [ ${CURRENT_COUNT} -gt ${BUILD_NBUILDS} ] ; then
> +                       CURRENT_COUNT=0;
> +                       wait;
>                fi
>        done
>  }
> @@ -531,6 +578,12 @@ build_targets() {
>  #-----------------------------------------------------------------------
>
>  print_stats() {
> +       if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +               for file in `ls ${OUTPUT_PREFIX}/ERR/`; do
> +                       ERR_LIST="${ERR_LIST} $file"
> +               done
> +               ERR_CNT=`ls -l ${OUTPUT_PREFIX}/ERR/ | wc | awk '{print $1}'`
> +       fi
>        echo ""
>        echo "--------------------- SUMMARY ----------------------------"
>        echo "Boards compiled: ${TOTAL_CNT}"
> @@ -549,3 +602,4 @@ set -- ${SELECTED} "$@"
>  # run PowerPC by default
>  [ $# = 0 ] && set -- powerpc
>  build_targets "$@"
> +wait
> --
> 1.7.3.4
>
>
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
>

Kumar Gala Nov. 8, 2011, 1:48 p.m. UTC | #2

On Nov 3, 2011, at 11:22 AM, Simon Glass wrote:

> Hi Andy,
> 
> On Thu, Nov 3, 2011 at 12:28 AM, Andy Fleming <afleming@freescale.com> wrote:
>> The MAKEALL script cleverly runs make with the appropriate options
>> to use all of the cores on the system, but your average U-Boot build
>> can't make much use of more than a few cores.  If you happen to have
>> a many-core server, your builds will leave most of the system idle.
>> 
>> In order to make full use of such a system, we need to build multiple
>> targets in parallel, and this requires directing make output into
>> multiple directories. We add a BUILD_NBUILDS variable, which allows
>> users to specify how many builds to run in parallel. I've found that
>> 16 is too many on my system (fork starts to fail). When BUILD_NBUILDS
>> is set greater than 1, we redefine BUILD_DIR for each build to be
>> ${BUILD_DIR}/${target}. Also, we make "./build" the default BUILD_DIR
>> when BUILD_NBUILDS is greater than 1.
>> 
>> Once each build finishes, we run make clean on its directory, to reduce
>> the footprint, and also remove all .depend files.
>> 
>> As a result, we are left with a build directory with all of the built
>> targets still there for use, which means anyone who wanted to use
>> MAKEALL as part of a test harness can now do so.
>> 
>> Signed-off-by: Andy Fleming <afleming@freescale.com>
>> ---
>> Inspired by all the MAKEALL improvements, I decided to clean up my old
>> one for parallel builds. I think this version addresses the concerns
>> raised last time...
> 

I see an issue if I have a build in working tree already.  I think we might need a distclean before starting

I also it seems this does things in clumps of BUILD_NBUILDS.  What I mean is, if BUILD_NBUILDS is 4, you do 4 and wait for all 4 to finish before starting the 5th job once the 1st is complete.  Anything we can do about that to always try and have 4 builds going until the end?

- k

Simon Glass Nov. 20, 2011, 6:11 a.m. UTC | #3

Hi Kumar,

On Tue, Nov 8, 2011 at 5:48 AM, Kumar Gala <galak@kernel.crashing.org> wrote:
>
> On Nov 3, 2011, at 11:22 AM, Simon Glass wrote:
>
>> Hi Andy,
>>
>> On Thu, Nov 3, 2011 at 12:28 AM, Andy Fleming <afleming@freescale.com> wrote:
>>> The MAKEALL script cleverly runs make with the appropriate options
>>> to use all of the cores on the system, but your average U-Boot build
>>> can't make much use of more than a few cores.  If you happen to have
>>> a many-core server, your builds will leave most of the system idle.
>>>
>>> In order to make full use of such a system, we need to build multiple
>>> targets in parallel, and this requires directing make output into
>>> multiple directories. We add a BUILD_NBUILDS variable, which allows
>>> users to specify how many builds to run in parallel. I've found that
>>> 16 is too many on my system (fork starts to fail). When BUILD_NBUILDS
>>> is set greater than 1, we redefine BUILD_DIR for each build to be
>>> ${BUILD_DIR}/${target}. Also, we make "./build" the default BUILD_DIR
>>> when BUILD_NBUILDS is greater than 1.
>>>
>>> Once each build finishes, we run make clean on its directory, to reduce
>>> the footprint, and also remove all .depend files.
>>>
>>> As a result, we are left with a build directory with all of the built
>>> targets still there for use, which means anyone who wanted to use
>>> MAKEALL as part of a test harness can now do so.
>>>
>>> Signed-off-by: Andy Fleming <afleming@freescale.com>
>>> ---
>>> Inspired by all the MAKEALL improvements, I decided to clean up my old
>>> one for parallel builds. I think this version addresses the concerns
>>> raised last time...
>>
>
> I see an issue if I have a build in working tree already.  I think we might need a distclean before starting
>
> I also it seems this does things in clumps of BUILD_NBUILDS.  What I mean is, if BUILD_NBUILDS is 4, you do 4 and wait for all 4 to finish before starting the 5th job once the 1st is complete.  Anything we can do about that to always try and have 4 builds going until the end?
>
> - k

What is the issue you see? Given that it uses an empty output dir for
each build, I can't see what could be wrong.

I am seeing a strange issue which is probably my fault. In a whole
MAKEALL maybe one build (always mx31 so far) fails with errors like
this:

$ more logs/mx31pdk.ERR
mv: cannot stat `/c/cosarm/buildall/u-boot/build/mx31pdk/include/autoconf.mk.tmp
': No such file or directory
/c/cosarm/buildall/u-boot/include/timestamp.h:27: fatal error: generated/timesta
mp_autogenerated.h: No such file or directory
compilation terminated.
make[1]: *** No rule to make target `/c/cosarm/buildall/u-boot/build/mx31pdk/arc
h/arm/cpu/arm1136/.depend', needed by `_depend'.  Stop.
In file included from /c/cosarm/buildall/u-boot/include/version.h:27:0,
                 from mkimage.c:26:
/c/cosarm/buildall/u-boot/include/timestamp.h:27:47: fatal error: generated/time
stamp_autogenerated.h: No such file or directory
compilation terminated.


/c/cosarm/buildall/u-boot/include/timestamp.h:27: fatal error: generated/timesta
mp_autogenerated.h: No such file or directory
compilation terminated.
make[1]: *** No rule to make target `/c/cosarm/buildall/u-boot/build/imx31_phyco
re/arch/arm/cpu/arm1136/.depend', needed by `_depend'.  Stop.
In file included from /c/cosarm/buildall/u-boot/include/version.h:27:0,
                 from mkimage.c:26:
/c/cosarm/buildall/u-boot/include/timestamp.h:27:47: fatal error: generated/time
stamp_autogenerated.h: No such file or directory
compilation terminated.
make[1]: *** [/c/cosarm/buildall/u-boot/build/imx31_phycore/tools/mkimage.o] Err
or 1
make: *** [tools] Error 2


I really can't see that Andy's change can affect this, but perhaps it
provokes an existing bug by changing the timing. I see the issue when
building 16 U-Boots at a time.

I haven't looked into it in detail - maybe there is a dependency
missing in the Makefile. But I would have thought that this line would
ensure that these files exist before anything else is done:

depend dep:	$(TIMESTAMP_FILE) $(VERSION_FILE) \
...

Regards,
Simon

Kumar Gala Nov. 20, 2011, 3:45 p.m. UTC | #4

On Nov 20, 2011, at 12:11 AM, Simon Glass wrote:

> Hi Kumar,
> 
> On Tue, Nov 8, 2011 at 5:48 AM, Kumar Gala <galak@kernel.crashing.org> wrote:
>> 
>> On Nov 3, 2011, at 11:22 AM, Simon Glass wrote:
>> 
>>> Hi Andy,
>>> 
>>> On Thu, Nov 3, 2011 at 12:28 AM, Andy Fleming <afleming@freescale.com> wrote:
>>>> The MAKEALL script cleverly runs make with the appropriate options
>>>> to use all of the cores on the system, but your average U-Boot build
>>>> can't make much use of more than a few cores.  If you happen to have
>>>> a many-core server, your builds will leave most of the system idle.
>>>> 
>>>> In order to make full use of such a system, we need to build multiple
>>>> targets in parallel, and this requires directing make output into
>>>> multiple directories. We add a BUILD_NBUILDS variable, which allows
>>>> users to specify how many builds to run in parallel. I've found that
>>>> 16 is too many on my system (fork starts to fail). When BUILD_NBUILDS
>>>> is set greater than 1, we redefine BUILD_DIR for each build to be
>>>> ${BUILD_DIR}/${target}. Also, we make "./build" the default BUILD_DIR
>>>> when BUILD_NBUILDS is greater than 1.
>>>> 
>>>> Once each build finishes, we run make clean on its directory, to reduce
>>>> the footprint, and also remove all .depend files.
>>>> 
>>>> As a result, we are left with a build directory with all of the built
>>>> targets still there for use, which means anyone who wanted to use
>>>> MAKEALL as part of a test harness can now do so.
>>>> 
>>>> Signed-off-by: Andy Fleming <afleming@freescale.com>
>>>> ---
>>>> Inspired by all the MAKEALL improvements, I decided to clean up my old
>>>> one for parallel builds. I think this version addresses the concerns
>>>> raised last time...
>>> 
>> 
>> I see an issue if I have a build in working tree already.  I think we might need a distclean before starting
>> 
>> I also it seems this does things in clumps of BUILD_NBUILDS.  What I mean is, if BUILD_NBUILDS is 4, you do 4 and wait for all 4 to finish before starting the 5th job once the 1st is complete.  Anything we can do about that to always try and have 4 builds going until the end?
>> 
>> - k
> 
> What is the issue you see? Given that it uses an empty output dir for
> each build, I can't see what could be wrong.

I think if I have an existing build in my source tree it screws up between that and the O= build.

> I am seeing a strange issue which is probably my fault. In a whole
> MAKEALL maybe one build (always mx31 so far) fails with errors like
> this:
> 
> $ more logs/mx31pdk.ERR
> mv: cannot stat `/c/cosarm/buildall/u-boot/build/mx31pdk/include/autoconf.mk.tmp
> ': No such file or directory
> /c/cosarm/buildall/u-boot/include/timestamp.h:27: fatal error: generated/timesta
> mp_autogenerated.h: No such file or directory
> compilation terminated.
> make[1]: *** No rule to make target `/c/cosarm/buildall/u-boot/build/mx31pdk/arc
> h/arm/cpu/arm1136/.depend', needed by `_depend'.  Stop.
> In file included from /c/cosarm/buildall/u-boot/include/version.h:27:0,
>                 from mkimage.c:26:
> /c/cosarm/buildall/u-boot/include/timestamp.h:27:47: fatal error: generated/time
> stamp_autogenerated.h: No such file or directory
> compilation terminated.
> 
> 
> /c/cosarm/buildall/u-boot/include/timestamp.h:27: fatal error: generated/timesta
> mp_autogenerated.h: No such file or directory
> compilation terminated.
> make[1]: *** No rule to make target `/c/cosarm/buildall/u-boot/build/imx31_phyco
> re/arch/arm/cpu/arm1136/.depend', needed by `_depend'.  Stop.
> In file included from /c/cosarm/buildall/u-boot/include/version.h:27:0,
>                 from mkimage.c:26:
> /c/cosarm/buildall/u-boot/include/timestamp.h:27:47: fatal error: generated/time
> stamp_autogenerated.h: No such file or directory
> compilation terminated.
> make[1]: *** [/c/cosarm/buildall/u-boot/build/imx31_phycore/tools/mkimage.o] Err
> or 1
> make: *** [tools] Error 2
> 
> 
> I really can't see that Andy's change can affect this, but perhaps it
> provokes an existing bug by changing the timing. I see the issue when
> building 16 U-Boots at a time.
> 
> I haven't looked into it in detail - maybe there is a dependency
> missing in the Makefile. But I would have thought that this line would
> ensure that these files exist before anything else is done:
> 
> depend dep:	$(TIMESTAMP_FILE) $(VERSION_FILE) \
> ...
> 
> Regards,
> Simon

Mike Frysinger Nov. 20, 2011, 9:23 p.m. UTC | #5

On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
> +if [ ! "${BUILD_NBUILDS}" ] ; then
> +	BUILD_NBUILDS="1"
> +fi

: ${BUILD_NBUILDS:=1}

> +	if [ ! "${BUILD_DIR}" ] ; then
> +		BUILD_DIR="./build"
> +	fi

: ${BUILD_DIR:=./build}

> +	mkdir -p ${BUILD_DIR}/ERR

mkdir -p "${BUILD_DIR}/ERR"

>  if [ ! "${BUILD_DIR}" ] ; then
>  	BUILD_DIR="."
>  fi

: ${BUILD_DIR:=.}

> +		mkdir -p ${output_dir}

mkdir -p "${output_dir}"

> +		${MAKE} clean
> +		find "${output_dir}" -type f -name '*.depend' | xargs rm

why not use distclean and avoid the `find` ?  otherwise, this find should be:
	find "${output_dir}" -type f -name '*.depend' -exec rm -f {} +

> +			ERR_CNT=$((ERR_CNT + 1))

	: $(( ERR_CNT += 1 ))

> -			build_target ${t}
> +			if [ "$BUILD_NBUILDS" -gt 1 ] ; then
> +				build_target ${t} &
> +			else
> +				build_target ${t}
> +			fi

you could keep this one line as:
	build_target ${t} &
	[ ${BUILD_NBUILDS} -eq 1 ] && wait

> +			TOTAL_CNT=$((TOTAL_CNT + 1))

: $(( TOTAL_CNT += 1 ))

> +			CURRENT_COUNT=$((CURRENT_COUNT + 1))

: $(( CURRENT_COUNT += 1 ))

> +		# Limit number of parallel builds
> +		if [ ${CURRENT_COUNT} -gt ${BUILD_NBUILDS} ] ; then
> +			CURRENT_COUNT=0;
> +			wait;
>  		fi

you don't need those semicolons.  also, this is not as good as it should be: 
if you're running 10 jobs in parallel, you fork 10, and then you wait for all 
of them to finish before forking another set of 10.

what you could do is something like:
	JOB_IDX=0
	JOB_IDX_FIRST=0
	BUILD_NBUILDS=1

	... foreach target ... ; do
		build_target &
		pids[$(( JOB_IDX++ ))]=$!
		if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS} ] ; then
			wait ${pids[$(( JOB_IDX_FIRST++ ))]}
		fi
	done
	wait

this isn't perfect as it assumes the first job launched will always finish first, 
but it's a lot closer than the current code.

> +		for file in `ls ${OUTPUT_PREFIX}/ERR/`; do
> +			ERR_LIST="${ERR_LIST} $file"
> +		done

ERR_LIST=$(ls ${OUTPUT_PREFIX}/ERR/)

> +		ERR_CNT=`ls -l ${OUTPUT_PREFIX}/ERR/ | wc | awk '{print $1}'`

you should use -1 instead of -l
-mike

Simon Glass Nov. 21, 2011, 5:15 a.m. UTC | #6

Hi Mike,

On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
>> +if [ ! "${BUILD_NBUILDS}" ] ; then
>> +     BUILD_NBUILDS="1"
>> +fi
>
> : ${BUILD_NBUILDS:=1}
>
>> +     if [ ! "${BUILD_DIR}" ] ; then
>> +             BUILD_DIR="./build"
>> +     fi
>
> : ${BUILD_DIR:=./build}
>
>> +     mkdir -p ${BUILD_DIR}/ERR
>
> mkdir -p "${BUILD_DIR}/ERR"
>
>>  if [ ! "${BUILD_DIR}" ] ; then
>>       BUILD_DIR="."
>>  fi
>
> : ${BUILD_DIR:=.}
>
>> +             mkdir -p ${output_dir}
>
> mkdir -p "${output_dir}"
>
>> +             ${MAKE} clean
>> +             find "${output_dir}" -type f -name '*.depend' | xargs rm
>
> why not use distclean and avoid the `find` ?  otherwise, this find should be:
>        find "${output_dir}" -type f -name '*.depend' -exec rm -f {} +
>
>> +                     ERR_CNT=$((ERR_CNT + 1))
>
>        : $(( ERR_CNT += 1 ))
>
>> -                     build_target ${t}
>> +                     if [ "$BUILD_NBUILDS" -gt 1 ] ; then
>> +                             build_target ${t} &
>> +                     else
>> +                             build_target ${t}
>> +                     fi
>
> you could keep this one line as:
>        build_target ${t} &
>        [ ${BUILD_NBUILDS} -eq 1 ] && wait
>
>> +                     TOTAL_CNT=$((TOTAL_CNT + 1))
>
> : $(( TOTAL_CNT += 1 ))
>
>> +                     CURRENT_COUNT=$((CURRENT_COUNT + 1))
>
> : $(( CURRENT_COUNT += 1 ))
>
>> +             # Limit number of parallel builds
>> +             if [ ${CURRENT_COUNT} -gt ${BUILD_NBUILDS} ] ; then
>> +                     CURRENT_COUNT=0;
>> +                     wait;
>>               fi
>
> you don't need those semicolons.  also, this is not as good as it should be:
> if you're running 10 jobs in parallel, you fork 10, and then you wait for all
> of them to finish before forking another set of 10.
>
> what you could do is something like:
>        JOB_IDX=0
>        JOB_IDX_FIRST=0
>        BUILD_NBUILDS=1
>
>        ... foreach target ... ; do
>                build_target &
>                pids[$(( JOB_IDX++ ))]=$!
>                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS} ] ; then
>                        wait ${pids[$(( JOB_IDX_FIRST++ ))]}
>                fi
>        done
>        wait
>
> this isn't perfect as it assumes the first job launched will always finish first,
> but it's a lot closer than the current code.

Since all the jobs are launched at the same time this is not really a
valid assumption is it? IMO on this point it's good enough as it is
for now...

Regards,
Simon

>
>> +             for file in `ls ${OUTPUT_PREFIX}/ERR/`; do
>> +                     ERR_LIST="${ERR_LIST} $file"
>> +             done
>
> ERR_LIST=$(ls ${OUTPUT_PREFIX}/ERR/)
>
>> +             ERR_CNT=`ls -l ${OUTPUT_PREFIX}/ERR/ | wc | awk '{print $1}'`
>
> you should use -1 instead of -l
> -mike
>
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
>
>

Andy Fleming Nov. 21, 2011, 6:28 a.m. UTC | #7

On Sun, Nov 20, 2011 at 11:15 PM, Simon Glass <sjg@chromium.org> wrote:
> Hi Mike,
>
> On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger <vapier@gentoo.org> wrote:
>> On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
>>
>> you don't need those semicolons.  also, this is not as good as it should be:
>> if you're running 10 jobs in parallel, you fork 10, and then you wait for all
>> of them to finish before forking another set of 10.
>>
>> what you could do is something like:
>>        JOB_IDX=0
>>        JOB_IDX_FIRST=0
>>        BUILD_NBUILDS=1
>>
>>        ... foreach target ... ; do
>>                build_target &
>>                pids[$(( JOB_IDX++ ))]=$!
>>                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS} ] ; then
>>                        wait ${pids[$(( JOB_IDX_FIRST++ ))]}
>>                fi
>>        done
>>        wait
>>
>> this isn't perfect as it assumes the first job launched will always finish first,
>> but it's a lot closer than the current code.
>
> Since all the jobs are launched at the same time this is not really a
> valid assumption is it? IMO on this point it's good enough as it is
> for now...

Mike's idea led me to a slightly crazier one that (I think) works.
Basically, I have each build_targets invocation create a file named
LOG/build_N when it is done, where N is the value of TOTAL_CNT when it
is invoked. Then I keep track of the oldest uncompleted build #. When
we hit our build limit, the script scans for LOG/build_N, for N from
oldest to TOTAL_CNT, and updates the count of outstanding builds, as
appropriate (and moves the index of the oldest). If the scanning
completes without freeing up room for another build, it waits for a
second, and then re-scans.

I tried this on my 24-core system with BUILD_NBUILDS set to 8 and
BUILD_NCPUS set to 4, and pegged the system at ~24-30 the whole time.

I will note that it only shaved about a minute off the 15 minutes it
took to build all 85xx. I'm still debugging a bit (I want to try 16
builds/2 cores, and clean up a bit), but I figured I'd ask what people
thought of this approach.

My next patch will also address Mike's comments. If I'm feeling
inspired, I might address some of your suggestions (I think I can use
the build-completion files to coalesce some of the debug output).

Andy

Andy Fleming Nov. 21, 2011, 6:56 p.m. UTC | #8

>> +             ${MAKE} clean
>> +             find "${output_dir}" -type f -name '*.depend' | xargs rm
>
> why not use distclean and avoid the `find` ?  otherwise, this find should be:
>        find "${output_dir}" -type f -name '*.depend' -exec rm -f {} +


distclean removes the u-boot executables. I thought it would be useful
to keep them around so one could theoretically test all of the systems
one built. I run make clean (which doesn't delete the results) and
delete the .depend files because otherwise one can very quickly fill
up a disk with ./MAKEALL powerpc

Mike Frysinger Nov. 21, 2011, 7 p.m. UTC | #9

On Monday 21 November 2011 00:15:21 Simon Glass wrote:
> On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger wrote:
> > On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
> >> +             # Limit number of parallel builds
> >> +             if [ ${CURRENT_COUNT} -gt ${BUILD_NBUILDS} ] ; then
> >> +                     CURRENT_COUNT=0;
> >> +                     wait;
> >>               fi
> > 
> > you don't need those semicolons.  also, this is not as good as it should
> > be: if you're running 10 jobs in parallel, you fork 10, and then you
> > wait for all of them to finish before forking another set of 10.
> > 
> > what you could do is something like:
> >        JOB_IDX=0
> >        JOB_IDX_FIRST=0
> >        BUILD_NBUILDS=1
> > 
> >        ... foreach target ... ; do
> >                build_target &
> >                pids[$(( JOB_IDX++ ))]=$!
> >                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS} ]
> > ; then wait ${pids[$(( JOB_IDX_FIRST++ ))]}
> >                fi
> >        done
> >        wait
> > 
> > this isn't perfect as it assumes the first job launched will always
> > finish first, but it's a lot closer than the current code.
> 
> Since all the jobs are launched at the same time this is not really a
> valid assumption is it? IMO on this point it's good enough as it is
> for now...

my point was, it's about as valid as you can get in bash.  you can't do non-
blocking wait's in bash, nor can you wait on a list of pids and have it return 
only one at a time.  the launching-jobs-at-the-same-time isn't really the 
worse part: builds can vary greatly in terms of how much code they actually 
compile.  some boards are very small while others are quite fat.

so your choices are:
	- make a general "good enough" assumption (which i did here)
	- have the main thread manually poll every child through some IPC mech 
such as looking for files the children touch when they're done, and *then* do 
the wait explicitly on that child

the problem with the latter is that the code tends to get much more 
complicated, and the main thread ends up doing quite a bit of busy work which 
kills CPU cycles.  my original assumption ends up keeping the CPU's fairly 
loaded i think, and is fairly simplistic code.
-mike

Mike Frysinger Nov. 21, 2011, 7:01 p.m. UTC | #10

On Monday 21 November 2011 01:28:50 Andy Fleming wrote:
> On Sun, Nov 20, 2011 at 11:15 PM, Simon Glass wrote:
> > On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger wrote:
> >> On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
> >> 
> >> you don't need those semicolons.  also, this is not as good as it should
> >> be: if you're running 10 jobs in parallel, you fork 10, and then you
> >> wait for all of them to finish before forking another set of 10.
> >> 
> >> what you could do is something like:
> >>        JOB_IDX=0
> >>        JOB_IDX_FIRST=0
> >>        BUILD_NBUILDS=1
> >> 
> >>        ... foreach target ... ; do
> >>                build_target &
> >>                pids[$(( JOB_IDX++ ))]=$!
> >>                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS}
> >> ] ; then wait ${pids[$(( JOB_IDX_FIRST++ ))]}
> >>                fi
> >>        done
> >>        wait
> >> 
> >> this isn't perfect as it assumes the first job launched will always
> >> finish first, but it's a lot closer than the current code.
> > 
> > Since all the jobs are launched at the same time this is not really a
> > valid assumption is it? IMO on this point it's good enough as it is
> > for now...
> 
> Mike's idea led me to a slightly crazier one that (I think) works.
> Basically, I have each build_targets invocation create a file named
> LOG/build_N when it is done, where N is the value of TOTAL_CNT when it
> is invoked. Then I keep track of the oldest uncompleted build #. When
> we hit our build limit, the script scans for LOG/build_N, for N from
> oldest to TOTAL_CNT, and updates the count of outstanding builds, as
> appropriate (and moves the index of the oldest). If the scanning
> completes without freeing up room for another build, it waits for a
> second, and then re-scans.

sounds like it would work, but i'd want to see how much code it takes to 
actually implement this

> I tried this on my 24-core system with BUILD_NBUILDS set to 8 and
> BUILD_NCPUS set to 4, and pegged the system at ~24-30 the whole time.
> 
> I will note that it only shaved about a minute off the 15 minutes it
> took to build all 85xx.

compare to what ?  your original patch, or my tweaked version ?
-mike

Andy Fleming Nov. 21, 2011, 7:03 p.m. UTC | #11

On Mon, Nov 21, 2011 at 1:01 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Monday 21 November 2011 01:28:50 Andy Fleming wrote:
>> On Sun, Nov 20, 2011 at 11:15 PM, Simon Glass wrote:
>> > On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger wrote:
>> >> On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
>> >>
>> >> you don't need those semicolons.  also, this is not as good as it should
>> >> be: if you're running 10 jobs in parallel, you fork 10, and then you
>> >> wait for all of them to finish before forking another set of 10.
>> >>
>> >> what you could do is something like:
>> >>        JOB_IDX=0
>> >>        JOB_IDX_FIRST=0
>> >>        BUILD_NBUILDS=1
>> >>
>> >>        ... foreach target ... ; do
>> >>                build_target &
>> >>                pids[$(( JOB_IDX++ ))]=$!
>> >>                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS}
>> >> ] ; then wait ${pids[$(( JOB_IDX_FIRST++ ))]}
>> >>                fi
>> >>        done
>> >>        wait
>> >>
>> >> this isn't perfect as it assumes the first job launched will always
>> >> finish first, but it's a lot closer than the current code.
>> >
>> > Since all the jobs are launched at the same time this is not really a
>> > valid assumption is it? IMO on this point it's good enough as it is
>> > for now...
>>
>> Mike's idea led me to a slightly crazier one that (I think) works.
>> Basically, I have each build_targets invocation create a file named
>> LOG/build_N when it is done, where N is the value of TOTAL_CNT when it
>> is invoked. Then I keep track of the oldest uncompleted build #. When
>> we hit our build limit, the script scans for LOG/build_N, for N from
>> oldest to TOTAL_CNT, and updates the count of outstanding builds, as
>> appropriate (and moves the index of the oldest). If the scanning
>> completes without freeing up room for another build, it waits for a
>> second, and then re-scans.
>
> sounds like it would work, but i'd want to see how much code it takes to
> actually implement this
>
>> I tried this on my 24-core system with BUILD_NBUILDS set to 8 and
>> BUILD_NCPUS set to 4, and pegged the system at ~24-30 the whole time.
>>
>> I will note that it only shaved about a minute off the 15 minutes it
>> took to build all 85xx.
>
> compare to what ?  your original patch, or my tweaked version ?

Compared to my original. However, I think that may be due to my
configuration. 8 builds with -j4 makes was 14, but 16 builds with -j2
resulted in just over 10 minutes.  I'm doing more testing.

Andy

Mike Frysinger Nov. 21, 2011, 7:11 p.m. UTC | #12

On Monday 21 November 2011 13:56:16 Andy Fleming wrote:
> >> +             ${MAKE} clean
> >> +             find "${output_dir}" -type f -name '*.depend' | xargs rm
> > 
> > why not use distclean and avoid the `find` ?  otherwise, this find should
> > be: find "${output_dir}" -type f -name '*.depend' -exec rm -f {} +
> 
> distclean removes the u-boot executables. I thought it would be useful
> to keep them around so one could theoretically test all of the systems
> one built. I run make clean (which doesn't delete the results) and
> delete the .depend files because otherwise one can very quickly fill
> up a disk with ./MAKEALL powerpc

perhaps add a new clean target that does this ("mostlyclean") ?  what you want 
sounds generally useful to me ...
-mike

Simon Glass Nov. 21, 2011, 7:29 p.m. UTC | #13

Hi Mike,

On Mon, Nov 21, 2011 at 11:00 AM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Monday 21 November 2011 00:15:21 Simon Glass wrote:
>> On Sun, Nov 20, 2011 at 1:23 PM, Mike Frysinger wrote:
>> > On Thursday 03 November 2011 03:28:29 Andy Fleming wrote:
>> >> +             # Limit number of parallel builds
>> >> +             if [ ${CURRENT_COUNT} -gt ${BUILD_NBUILDS} ] ; then
>> >> +                     CURRENT_COUNT=0;
>> >> +                     wait;
>> >>               fi
>> >
>> > you don't need those semicolons.  also, this is not as good as it should
>> > be: if you're running 10 jobs in parallel, you fork 10, and then you
>> > wait for all of them to finish before forking another set of 10.
>> >
>> > what you could do is something like:
>> >        JOB_IDX=0
>> >        JOB_IDX_FIRST=0
>> >        BUILD_NBUILDS=1
>> >
>> >        ... foreach target ... ; do
>> >                build_target &
>> >                pids[$(( JOB_IDX++ ))]=$!
>> >                if [ $(( JOB_IDX - JOB_IDX_FIRST )) -ge ${BUILD_NBUILDS} ]
>> > ; then wait ${pids[$(( JOB_IDX_FIRST++ ))]}
>> >                fi
>> >        done
>> >        wait
>> >
>> > this isn't perfect as it assumes the first job launched will always
>> > finish first, but it's a lot closer than the current code.
>>
>> Since all the jobs are launched at the same time this is not really a
>> valid assumption is it? IMO on this point it's good enough as it is
>> for now...
>
> my point was, it's about as valid as you can get in bash.  you can't do non-
> blocking wait's in bash, nor can you wait on a list of pids and have it return
> only one at a time.  the launching-jobs-at-the-same-time isn't really the
> worse part: builds can vary greatly in terms of how much code they actually
> compile.  some boards are very small while others are quite fat.
>
> so your choices are:
>        - make a general "good enough" assumption (which i did here)
>        - have the main thread manually poll every child through some IPC mech
> such as looking for files the children touch when they're done, and *then* do
> the wait explicitly on that child

You missed one:

- write it in Python :-)

(which I might try if patman is accepted)

>
> the problem with the latter is that the code tends to get much more
> complicated, and the main thread ends up doing quite a bit of busy work which
> kills CPU cycles.  my original assumption ends up keeping the CPU's fairly
> loaded i think, and is fairly simplistic code.
> -mike
>

It should be easy enough in Python to just maintain 20 threads (or
whatever) and start the next job when a thread becomes idle.

Regards,
Simon

Mike Frysinger Nov. 21, 2011, 8:13 p.m. UTC | #14

On Monday 21 November 2011 14:29:35 Simon Glass wrote:
> - write it in Python :-)

yeah, i'm not interested in doing that.  imo, these build scripts should 
remain in shell land.  i don't even like things being moved to perl.
-mike

Simon Glass Nov. 21, 2011, 8:33 p.m. UTC | #15

Hi Mike,

On Mon, Nov 21, 2011 at 12:13 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Monday 21 November 2011 14:29:35 Simon Glass wrote:
>> - write it in Python :-)
>
> yeah, i'm not interested in doing that.  imo, these build scripts should
> remain in shell land.  i don't even like things being moved to perl.
> -mike
>

That's fine with me, iwc I think Andy's solution is about as far as we
want to go in bash-land.

Regards,
Simon

[U-Boot] Allow for parallel builds and saved output

Commit Message

Comments

Patch