gcc parallel make check

Message ID 20140911075123.GN17454@tucnak.redhat.com
State New

Commit Message

Jakub Jelinek Sept. 11, 2014, 7:51 a.m. UTC
On Wed, Sep 10, 2014 at 11:23:34PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 10, 2014 at 11:08:22PM +0200, Jakub Jelinek wrote:
> > Perhaps better approach might be if we have some way how to synchronize among
> > multiple expect processes and spawn only as many expects (of course, per
> > check target) as there are CPUs.  E.g. if mkdir is atomic on all
> > hosts/filesystems we care about, we could have some shared directory that
> > make would clear before spawning all the expects, and after checking
> > runtest_file_p we could attempt to mkdir something (e.g. testcase filename
> > with $(srcdir) part removed, or *.exp filename / counter what test are we
> > considering or something similar) in the shared directory, if that would
> > succeed, it would tell us that we are the process that should run the test,
> > if that failed, we'd know some other runtest did that.
> > Or perhaps not for every single test, but every 10 or 100 tests or
> > something.
> > 
> > E.g. we could just override runtest_file_p itself, so that it would first
> > call the original dejagnu version, and then do this check.
> 
> Seems file mkdir in tcl doesn't error on pre-existing directory, so perhaps
> [open $path {WRONLY EXCL CREAT}] ?
> Now, does this work properly on all hosts we care about?

Here is a proof of concept on the tcl side.
To get a long sequence of numbers in the Makefile, I guess we can use
something like
check_p_numbers0:=1 2 3 4 5 6 7 8 9  
check_p_numbers1:=0 $(check_p_numbers0)
check_p_numbers2:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers1)))
check_p_numbers3:=$(patsubst %,0%,$(check_p_numbers1)) $(check_p_numbers2)
check_p_numbers4:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers3)))
check_p_numbers5:=$(patsubst %,0%,$(check_p_numbers3)) $(check_p_numbers4)
check_p_numbers6:=$(foreach i,$(check_p_numbers0),$(patsubst %,$(i)%,$(check_p_numbers5)))
check_p_numbers:=$(check_p_numbers0) $(check_p_numbers2) $(check_p_numbers4) $(check_p_numbers6)
(and then the same thing that
check_p_subdirs=$(wordlist 1,$(words $(check_$*_parallelize)),$(check_p_numbers))
uses, just with $(check_$*_parallelize) replaced with something that matches
the number of desired goals).
Looking at the *.exp files, it seems only some of them (though that includes
most of the time-consuming ones) actually use runtest_file_p; compat.exp,
struct-layout-1.exp and several others don't.
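
For illustration: the check_p_numbers lists above expand to the integers 1 through 9999 in order, and $(wordlist 1,N,...) then keeps the first N of them. A rough shell equivalent is sketched below. This is only a sketch; check_p_subdirs_sh is a made-up name and is not part of the Makefile.

```shell
#!/bin/sh
# Hypothetical helper: print 1..N separated by spaces, capped at 9999
# because that is where the check_p_numbers list ends.
check_p_subdirs_sh() {
    n=$1
    [ "$n" -le 9999 ] || n=9999
    i=1
    out=
    while [ "$i" -le "$n" ]; do
        out="$out${out:+ }$i"
        i=$((i + 1))
    done
    echo "$out"
}

check_p_subdirs_sh 5    # prints: 1 2 3 4 5
```

The cap matters for the later question of how many goals to create: a request for more than 9999 slots silently yields only 9999 words.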

So, IMHO what we should do in the Makefile is, right inside
        @if [ -z "$(filter-out --target_board=%,$(filter-out --extra_opts%,$(RUNTESTFLAGS)))" ] \
            && [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
first run rm -rf $(TESTSUITEDIR)/$*-parallel; mkdir $(TESTSUITEDIR)/$*-parallel
so that we start with an empty directory, and compute check_p_subdirs from the
actual -jN number.  Then, in the check-parallel-gcc_1 etc. goals (but not in
check-parallel-gcc), set
GCC_RUNTEST_PARALLELIZE_DIR=$(TESTSUITEDIR)/$(check_p_tool)-parallel
in the environment and use RUNTESTFLAGS with selected *.exp files that are
known to be parallelizable (dg.exp, execute.exp, compile.exp and the like),
and use all the other *.exp files for check-parallel-gcc.
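
To make the claiming scheme concrete, here is a minimal shell sketch of the idea, not the proposed implementation. Assumptions: shell mkdir (which, unlike Tcl's file mkdir, fails on an existing directory) stands in for the exclusive open used on the Tcl side, a temporary directory stands in for the *-parallel directory, and claim_slot and run_worker are hypothetical names.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical.

# Succeeds for exactly one caller per (directory, counter) pair, because
# mkdir without -p fails when the target already exists.
claim_slot() {
    mkdir "$1/$2" 2>/dev/null
}

# Walk the test list in order with a private counter, as each runtest
# process would, and print only the tests this worker managed to claim.
run_worker() {
    dir=$1; shift
    counter=0
    for testcase in "$@"; do
        if claim_slot "$dir" "$counter"; then
            echo "$testcase"
        fi
        counter=$((counter + 1))
    done
}

# Start from an empty shared directory, as the Makefile would.
par_dir=$(mktemp -d)
out_dir=$(mktemp -d)
run_worker "$par_dir" a.c b.c c.c > "$out_dir/w1" &
run_worker "$par_dir" a.c b.c c.c > "$out_dir/w2" &
wait

# Across both workers every test shows up exactly once.
sort "$out_dir/w1" "$out_dir/w2"
```

Which worker wins a given slot depends on scheduling; the scheme only relies on all workers seeing the tests in the same order so that their counters agree.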

Thoughts on this?

Unfortunately, I'm not sure how that would work with the
check-subtargets stuff if people are used to parallelizing testing across
multiple machines (though it is unclear to me how they merge the log/sum
files from the multiple machines anyway).  I'm also not sure whether this
works over NFS/AFS and other networked filesystems; if it does, they could
presumably arrange for the *-parallel directories to be shared.

I can't find how to query the -jN value passed to make check by the user
though; both $(MFLAGS) and $(MAKEFLAGS) only contain something like
--jobserver-fds=3,5 -j, from which it is not possible to find out what a
reasonable upper limit on the number of goals would be.  Running too many
goals would waste time (once scheduled, a goal would only wildcard all the
tests and find, for each of them, that the *-parallel directory says it has
already been run), while running too few could prevent good parallelization.



	Jakub

Comments

Jakub Jelinek Sept. 11, 2014, 8:06 a.m. UTC | #1
On Thu, Sep 11, 2014 at 09:51:23AM +0200, Jakub Jelinek wrote:
> I can't find how to query the -jN value passed to make check by the user
> though, both $(MFLAGS) and $(MAKEFLAGS) only contain something like
> --jobserver-fds=3,5 -j from which it is not possible to find out how many
> goals would be the upper reasonable limit.  Running too many goals would
> waste time (once scheduled, the goal would only wildcard all the test, and
> for all of them find in the *-parallel directory the test has been run
> already), running too few could prevent good parallelization.

After a little googling, it seems there is no way to do that :(, unless
one e.g. attempts to find the topmost parent make and scan its command
line through ps or something.
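
A rough sketch of what that ps approach could look like, only to show how fragile it is. Both function names are hypothetical, and the matching assumes the parent is invoked as make or gmake with -jN, "-j N" or --jobs=N spelled out as separate words on its command line.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Print N from the first -jN, "-j N" or --jobs=N in a command line string.
# The body runs in a subshell so that "set -f" (no globbing while the
# string is split into words) does not leak into the caller.
jobs_from_cmdline() (
    set -f
    set -- $1
    while [ $# -gt 0 ]; do
        case $1 in
            --jobs=[0-9]*) echo "${1#--jobs=}"; exit 0 ;;
            -j[0-9]*)      echo "${1#-j}"; exit 0 ;;
            -j)
                case $2 in
                    ''|*[!0-9]*) ;;              # bare -j: no number given
                    *) echo "$2"; exit 0 ;;
                esac ;;
        esac
        shift
    done
    exit 1
)

# Walk up the process tree and print the job count of the outermost
# make/gmake ancestor that has one; fail if none is found.
topmost_make_jobs() {
    pid=$$
    found=
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        args=$(ps -o args= -p "$pid") || break
        case $args in
            make\ *|*/make\ *|gmake\ *|*/gmake\ *)
                n=$(jobs_from_cmdline "$args") && found=$n ;;
        esac
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    done
    [ -n "$found" ] && echo "$found"
}

jobs_from_cmdline "make -j512 -k check"    # prints 512
```

This misses job counts that come from the MAKEFLAGS environment variable, from wrapper scripts, or from combined short options, which is part of why it does not look like a real solution.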

There is an option to touch, say, a *-parallel/finished file once any of the
check-parallel-gcc-{1,2,...} goals is done (because when it finishes, it
means all the parallelizable tests for the particular check-$lang have either
finished, or at least had their file created) and not start runtest at all if
finished already exists.  But I guess it would still be undesirable to have
tens of thousands of goals by default, so perhaps we could go with, say,
128 subgoals by default and have some env var to override it, so on
really highly parallel boxes you'd specify
make -j512 -k check GCC_TEST_PARALLEL_SLOTS=512
or similar.
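
As a sketch of how the finished marker and the slot default could fit together in the recipe (run_subgoal is a hypothetical name, and the command passed to it stands in for the real runtest invocation):

```shell
#!/bin/sh
# Sketch only: run_subgoal is a hypothetical helper.

# Number of subgoals: GCC_TEST_PARALLEL_SLOTS if set, else 128.
slots=${GCC_TEST_PARALLEL_SLOTS:-128}

# Run one subgoal's command unless some other subgoal already finished,
# in which case every parallelizable test has been claimed and starting
# runtest again would only waste time.
run_subgoal() {
    dir=$1; shift
    if [ -e "$dir/finished" ]; then
        return 0
    fi
    "$@"
    status=$?
    # Mark completion even on failure, so that with make -k the remaining
    # subgoals still skip their now pointless runs.
    : > "$dir/finished"
    return "$status"
}
```

A subgoal would then be invoked as something like run_subgoal "$GCC_RUNTEST_PARALLELIZE_DIR" followed by its runtest command line, with at most $slots such subgoals generated.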

	Jakub
Patch

--- gcc/testsuite/lib/gcc-defs.exp.jj	2014-09-01 09:43:28.000000000 +0200
+++ gcc/testsuite/lib/gcc-defs.exp	2014-09-11 08:37:43.871943270 +0200
@@ -188,6 +188,30 @@  if { [info procs runtest_file_p] == "" }
     }
 }
 
+if { [info exists env(GCC_RUNTEST_PARALLELIZE_DIR)] \
+     && [info procs runtest_file_p] != [list] \
+     && [info procs gcc_parallelize_saved_runtest_file_p] == [list] } then {
+    rename runtest_file_p gcc_parallelize_saved_runtest_file_p
+    global gcc_runtest_parallelize_counter
+
+    set gcc_runtest_parallelize_counter 0
+    proc runtest_file_p { runtests testcase } {
+	global gcc_runtest_parallelize_counter
+	if ![gcc_parallelize_saved_runtest_file_p $runtests $testcase] {
+	    return 0
+	}
+
+	set dir [getenv GCC_RUNTEST_PARALLELIZE_DIR]
+	set path $dir/$gcc_runtest_parallelize_counter
+	set gcc_runtest_parallelize_counter [expr {$gcc_runtest_parallelize_counter + 1}]
+	if {![catch {open $path {RDWR CREAT EXCL} 0600} fd]} {
+	    close $fd
+	    return 1
+	}
+	return 0
+    }
+}
+
 # Like dg-options, but adds to the default options rather than replacing them.
 
 proc dg-additional-options { args } {