From patchwork Tue Jan 2 19:56:08 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matt Weber
X-Patchwork-Id: 854726
From: Matt Weber
To: buildroot@buildroot.org
Date: Tue, 2 Jan 2018 13:56:08 -0600
Message-Id: <20180102195608.10806-1-matthew.weber@rockwellcollins.com>
X-Mailer: git-send-email 2.14.2
Subject: [Buildroot] [PATCH v2] hung build: convert to monitor thread
List-Id: Discussion and development of buildroot

Monitor build-time.log for modifications to detect a hung build. If the
file has not been modified for 60 minutes, kill the build and report a
timeout.

This removes the upper bound on build duration. As the number of
autobuilder failures drops, builds run longer, and we start to see false
timeout failures when a build hits the boundary of the old MAX_DURATION
(~8 hours).

Signed-off-by: Matthew Weber
---
Change Log v1->v2
[Thomas P.
- Use mtime vs reading file
- Use datetime for hung delta check
- Removed camel case
- Added hung build event to sync hand-off back to main thread
---
 scripts/autobuild-run | 53 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 8475437..7274c80 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -73,6 +73,8 @@ import sys
 from time import localtime, strftime
 from distutils.version import StrictVersion
 import platform
+from threading import Thread, Event
+import datetime
 
 if sys.hexversion >= 0x3000000:
     import configparser
@@ -167,7 +169,7 @@ else:
     decode_bytes = _identity
     encode_str = _identity
 
-MAX_DURATION = 60 * 60 * 8
+HUNG_BUILD_TIMEOUT = 60 # mins
 VERSION = 1
 
 def log_write(logf, msg):
@@ -199,7 +201,7 @@ def get_branch():
     return branches[randint(0, len(branches) - 1)]
 
 class SystemInfo:
-    DEFAULT_NEEDED_PROGS = ["make", "git", "gcc", "timeout"]
+    DEFAULT_NEEDED_PROGS = ["make", "git", "gcc"]
     DEFAULT_OPTIONAL_PROGS = ["bzr", "java", "javac", "jar"]
 
     def __init__(self):
@@ -358,6 +360,24 @@ def gen_config(**kwargs):
         ret = subprocess.call(args, stdout=devnull, stderr=log)
     return ret
 
+def stop_on_build_hang(monitor_thread_hung_build_flag,
+                       monitor_thread_stop_flag,
+                       sub_proc, outputdir, log):
+    build_time_logfile = os.path.join(outputdir, "build/build-time.log")
+    while True:
+        if monitor_thread_stop_flag.is_set():
+            return
+        if os.path.exists(build_time_logfile):
+            mtime = datetime.datetime.fromtimestamp(os.stat(build_time_logfile).st_mtime)
+
+            if mtime < datetime.datetime.now() - datetime.timedelta(minutes=HUNG_BUILD_TIMEOUT):
+                if sub_proc.poll() is None:
+                    monitor_thread_hung_build_flag.set() # Used by do_build() to determine build hang
+                    log_write(log, "INFO: build hung")
+                    sub_proc.kill()
+                break
+        monitor_thread_stop_flag.wait(30)
+
 def do_build(**kwargs):
     """Run the build itself"""
 
@@ -380,25 +400,40 @@ def do_build(**kwargs):
     f = open(os.path.join(outputdir, "logfile"), "w+")
     log_write(log, "INFO: build started")
 
-    cmd = ["timeout", str(MAX_DURATION),
-           "nice", "-n", str(nice),
+    cmd = ["nice", "-n", str(nice),
            "make", "O=%s" % outputdir,
            "-C", srcdir, "BR2_DL_DIR=%s" % dldir,
            "BR2_JLEVEL=%s" % kwargs['njobs']] \
           + kwargs['make_opts'].split()
     sub = subprocess.Popen(cmd, stdout=f, stderr=f)
+
+    # Setup hung build monitoring thread
+    monitor_thread_hung_build_flag = Event()
+    monitor_thread_stop_flag = Event()
+    build_monitor = Thread(target=stop_on_build_hang,
+                           args=(monitor_thread_hung_build_flag,
+                                 monitor_thread_stop_flag,
+                                 sub, outputdir, log))
+    build_monitor.daemon = True
+    build_monitor.start()
+
     kwargs['buildpid'][kwargs['instance']] = sub.pid
     ret = sub.wait()
     kwargs['buildpid'][kwargs['instance']] = 0
 
-    # 124 is a special error code that indicates we have reached the
-    # timeout
-    if ret == 124:
-        log_write(log, "INFO: build timed out")
+    # If build failed, monitor thread would have exited at this point
+    if monitor_thread_hung_build_flag.is_set():
+        log_write(log, "INFO: build timed out [%d]" % ret)
         return -2
+    else:
+        # Stop monitor thread as this build didn't timeout
+        monitor_thread_stop_flag.set()
+        # Monitor thread should be exiting around this point
+
     if ret != 0:
-        log_write(log, "INFO: build failed")
+        log_write(log, "INFO: build failed [%d]" % ret)
         return -1
+
     cmd = ["make", "O=%s" % outputdir, "-C", srcdir,
            "BR2_DL_DIR=%s" % dldir, "legal-info"] \
           + kwargs['make_opts'].split()
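
To try the monitor-thread pattern from this patch outside autobuild-run, the following is a minimal standalone sketch, not the code the patch adds. The names watch_for_hang, run_with_watchdog and demo, and the timeout and poll_interval parameters, are hypothetical and chosen only for illustration. It assumes the same approach as the patch: compare the watched file's mtime against "now minus timeout", kill the child if the file is stale, and use one Event to report the hang and a second Event to stop the monitor. Unlike the patch, it passes the timeout in as a parameter and joins the monitor thread before returning.

```python
import datetime
import os
import subprocess
import sys
import tempfile
import threading


def watch_for_hang(hung_flag, stop_flag, proc, watched_file,
                   timeout, poll_interval):
    """Kill proc if watched_file was not modified within timeout.

    timeout is a datetime.timedelta; poll_interval is in seconds.
    Both parameters are illustrative, they are not part of the patch.
    """
    while not stop_flag.is_set():
        # As in the patch, nothing happens until the watched file exists.
        if os.path.exists(watched_file):
            mtime = datetime.datetime.fromtimestamp(
                os.stat(watched_file).st_mtime)
            if mtime < datetime.datetime.now() - timeout:
                if proc.poll() is None:
                    # Set the flag before killing so the caller can tell
                    # a watchdog kill apart from an ordinary failure.
                    hung_flag.set()
                    proc.kill()
                return
        # Sleeps up to poll_interval, but wakes at once when stop_flag is set.
        stop_flag.wait(poll_interval)


def run_with_watchdog(cmd, watched_file, timeout, poll_interval=0.1):
    """Run cmd under the watchdog and return (returncode, hung)."""
    proc = subprocess.Popen(cmd)
    hung_flag = threading.Event()
    stop_flag = threading.Event()
    monitor = threading.Thread(
        target=watch_for_hang,
        args=(hung_flag, stop_flag, proc, watched_file,
              timeout, poll_interval))
    monitor.daemon = True
    monitor.start()

    ret = proc.wait()
    # Harmless if the monitor already returned after killing the child.
    stop_flag.set()
    monitor.join()
    return ret, hung_flag.is_set()


def demo():
    """Run one child that hangs and one that exits; return both results."""
    with tempfile.NamedTemporaryFile() as tmp:
        hang = run_with_watchdog(
            [sys.executable, "-c", "import time; time.sleep(30)"],
            tmp.name, datetime.timedelta(seconds=0.5))
        ok = run_with_watchdog(
            [sys.executable, "-c", "pass"],
            tmp.name, datetime.timedelta(minutes=60))
    return hang, ok
```

Calling demo() runs a child that sleeps without touching the watched file, so the watchdog kills it once the file's mtime is older than the 0.5 second timeout, and then a child that exits immediately and is never killed. One property carried over from the patch is worth noting: if the watched file never appears, the watchdog never fires, so a build that hangs before build-time.log is created would not be caught by this check alone.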