From patchwork Sun Feb 16 21:59:08 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nathaniel Smith X-Patchwork-Id: 320812 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id A7EFB2C0086 for ; Mon, 17 Feb 2014 08:59:20 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; q=dns; s=default; b=CdX2SYDa/fQClN+0K5 dDZhNUKCr7zkcxeljsy2EAZl//XCiIG5P315vlShkWsHO3kmVOfIvm4ChEULFxTV fkgxaWl9p5UPxWEH1ZA8GC5yE52nA5BaiA/Bk7ANVr8h6Q0qwdmc4z/Z6QsGNWZX eo6hIBnU43Z+4HVeIwbB7lFGs= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; s=default; bh=6RvVkO3d0AH4JzkEWLdf/0/o 4Ss=; b=nIDTRl01tsocIoTgE5OU3HnlqY6U9GSdMJeOAED9adoDc+FaWb5k8buW m1RZ9snRGb5d9CpJsunzLVJWIh3bcyosQ2iE9yaJRvsthtx9VoRrTF/Yod/DEXc+ 8WI5mWpZg3/cgadJAf9ZL9GBc9AjmSeI7X20ShQOjQIX2mBsB/Q= Received: (qmail 16142 invoked by alias); 16 Feb 2014 21:59:13 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 16130 invoked by uid 89); 16 Feb 2014 21:59:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2 X-HELO: mail-oa0-f46.google.com Received: from mail-oa0-f46.google.com (HELO mail-oa0-f46.google.com) (209.85.219.46) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Sun, 16 Feb 2014 21:59:10 +0000 Received: by mail-oa0-f46.google.com with SMTP id n16so16761733oag.19 for ; Sun, 16 Feb 2014 13:59:08 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=5YAFgiYf+ECOeYXSh6Hk0k0xxGEsM9O8MCLMJLLTqHI=; b=lc07Rzt87/sfVSApHldCqk8dFDHVjuTQ7gzSoEhZm6tcV5bf7MgtA1NLuQ2q+h4Xhb Z95Z4tpuLFhTpAK9/Z5hNx9o8MEsmOZyNglYbAEarRqa31qshWhZziT85dIHcAEfyzdG TFg+Hq0JbasRk4oq70YydRcUEhAJ1/uSUwqfFm4BoO0lhYh8COkFvigHZnyDAfNjPZ4/ OL5P0XcdcL3etMvTsd+o4nqpfmpr67n+Uq04NkWAV9d7cwXqyeIPmuuWBGXrJPPrcx/I VI0nnUlwuQtDxItJKDhm6CVcZJOlX/OYpPLZKTYvCgIjCMDZnr2nxyXu52h4Yn3el67Z EvFQ== X-Gm-Message-State: ALoCoQmUkxqkHx7deHqbhlFYqOT5M+CZo8nerHMVYD0Am/dvGWvmeXKv4DCdSlytuEmqKA0tknSM MIME-Version: 1.0 X-Received: by 10.182.160.102 with SMTP id xj6mr17909966obb.19.1392587948695; Sun, 16 Feb 2014 13:59:08 -0800 (PST) Received: by 10.60.15.2 with HTTP; Sun, 16 Feb 2014 13:59:08 -0800 (PST) In-Reply-To: <52FE7918.1020503@redhat.com> References: <52FD37A1.6040404@redhat.com> <20140214082124.GA20378@tucnak.redhat.com> <52FE7918.1020503@redhat.com> Date: Sun, 16 Feb 2014 16:59:08 -0500 Message-ID: Subject: Re: [PATCH] [libgomp] make it possible to use OMP on both sides of a fork From: Nathaniel Smith To: Richard Henderson Cc: Jakub Jelinek , gcc-patches@gcc.gnu.org X-IsSubscribed: yes On Fri, Feb 14, 2014 at 3:14 PM, Richard Henderson wrote: > On 02/14/2014 12:21 AM, Jakub Jelinek wrote: >>> Any reason not to just run gomp_free_thread_pool from gomp_after_fork_callback >>> directly? I see no restrictions on what kind of code is allowed to execute >>> during that callback. >> >> Well, fork is async signal safe function, so calling malloc/free, or any >> kind of synchronization primitives is completely unsafe there. > > That's as may be, but even the opengroup's rationale for pthread_atfork > mentions using locks in the three callbacks. I strongly suspect that no real > use of pthread_atfork can ever really be async safe. Yes, but the problem is that depending on what the user intends to do after forking, our pthread_atfork handler might help or it might hurt, and we don't know which. Consider these two cases: - fork+exec - fork+continue to use OMP in child The former case is totally POSIX-legal, even when performed at arbitrary places, even when another thread is, say, in the middle of calling malloc(). If we register a pthread_atfork handler which calls non-signal-safe functions, then we risk breaking POSIX-legal programs like this. The latter case is broken in current GOMP, but we would like it to work as well -- at least when possible. So the way the patch is structured the way it is, is to ensure that we have minimal impact on the former case while still giving the latter case a chance to succeed. Updated patch addressing your other comments attached. 2014-02-12 Nathaniel J. Smith * team.c (gomp_free_pool_helper): Move per-thread cleanup to main thread. (gomp_free_thread): Delegate implementation to... (gomp_free_thread_pool): ...this new function. Like old gomp_free_thread, but does per-thread cleanup, and has option to skip everything that involves interacting with actual threads, which is useful when called after fork. (gomp_after_fork_callback): New function. (gomp_team_start): Register atfork handler, and check for fork on entry. Index: team.c =================================================================== --- team.c (revision 207398) +++ team.c (working copy) @@ -28,6 +28,7 @@ #include "libgomp.h" #include #include +#include /* This attribute contains PTHREAD_CREATE_DETACHED. */ pthread_attr_t gomp_thread_attr; @@ -43,6 +44,8 @@ __thread struct gomp_thread gomp_tls_data; pthread_key_t gomp_tls_key; #endif +/* This is to enable best-effort cleanup after fork. */ +static bool gomp_we_are_forked; /* This structure is used to communicate across pthread_create. */ @@ -204,42 +207,41 @@ static struct gomp_thread_pool *gomp_new_thread_po return pool; } +/* Free a thread pool and release its threads. */ + static void gomp_free_pool_helper (void *thread_pool) { - struct gomp_thread *thr = gomp_thread (); struct gomp_thread_pool *pool = (struct gomp_thread_pool *) thread_pool; gomp_barrier_wait_last (&pool->threads_dock); - gomp_sem_destroy (&thr->release); - thr->thread_pool = NULL; - thr->task = NULL; pthread_exit (NULL); } -/* Free a thread pool and release its threads. */ - -void -gomp_free_thread (void *arg __attribute__((unused))) +static void +gomp_free_thread_pool (bool threads_are_running) { struct gomp_thread *thr = gomp_thread (); struct gomp_thread_pool *pool = thr->thread_pool; if (pool) { + int i; if (pool->threads_used > 0) { - int i; - for (i = 1; i < pool->threads_used; i++) + if (threads_are_running) { - struct gomp_thread *nthr = pool->threads[i]; - nthr->fn = gomp_free_pool_helper; - nthr->data = pool; + for (i = 1; i < pool->threads_used; i++) + { + struct gomp_thread *nthr = pool->threads[i]; + nthr->fn = gomp_free_pool_helper; + nthr->data = pool; + } + /* This barrier undocks threads docked on pool->threads_dock. */ + gomp_barrier_wait (&pool->threads_dock); + /* And this waits till all threads have called + gomp_barrier_wait_last in gomp_free_pool_helper. */ + gomp_barrier_wait (&pool->threads_dock); } - /* This barrier undocks threads docked on pool->threads_dock. */ - gomp_barrier_wait (&pool->threads_dock); - /* And this waits till all threads have called gomp_barrier_wait_last - in gomp_free_pool_helper. */ - gomp_barrier_wait (&pool->threads_dock); /* Now it is safe to destroy the barrier and free the pool. */ gomp_barrier_destroy (&pool->threads_dock); @@ -251,6 +253,14 @@ gomp_free_pool_helper (void *thread_pool) gomp_managed_threads -= pool->threads_used - 1L; gomp_mutex_unlock (&gomp_managed_threads_lock); #endif + /* Clean up thread objects */ + for (i = 1; i < pool->threads_used; i++) + { + struct gomp_thread *nthr = pool->threads[i]; + gomp_sem_destroy (&nthr->release); + nthr->thread_pool = NULL; + nthr->task = NULL; + } } free (pool->threads); if (pool->last_team) @@ -266,6 +276,58 @@ gomp_free_pool_helper (void *thread_pool) } } +/* This is called whenever a thread exits which has a non-NULL value for + gomp_thread_destructor. In practice, the only thread for which this occurs + is the one which created the thread pool. +*/ +void +gomp_free_thread (void *arg __attribute__((unused))) +{ + gomp_free_thread_pool (true); +} + +/* This is called in the child process after a fork. + + According to POSIX, if a process which uses threads calls fork(), then + there are very few things that the resulting child process can do safely -- + mostly just exec(). + + However, in practice, (almost?) all POSIX implementations seem to allow + arbitrary code to run inside the child, *if* the parent process's threads + are in a well-defined state when the fork occurs. And this circumstance can + easily arise in OMP-using programs, e.g. when a library function like DGEMM + uses OMP internally, and some other unrelated part of the program calls + fork() at some other time, when no OMP sections are running. + + Therefore, we make a best effort attempt to handle the case: + + OMP section (in parent) -> quiesce -> fork -> OMP section (in child) + + "Best-effort" here means that: + - Your system may or may not be able to handle this kind of code at all; + our goal is just to make sure that if it fails it's not gomp's fault. + - All threadprivate variables will be reset in the child. Fortunately this + is entirely compliant with the spec, according to the rule of nasal + demons. + - We must have minimal speed impact, and no correctness impact, on + compliant programs. + + We use this callback to notice when a fork has a occurred, and if the child + later attempts to enter an OMP section (via gomp_team_start), then we know + that it is non-compliant, and are free to apply our best-effort strategy of + cleaning up the old thread pool structures and spawning a new one. Because + compliant programs never call gomp_team_start after forking, they are + unaffected. +*/ +static void +gomp_after_fork_callback (void) +{ + /* Only "async-signal-safe operations" are allowed here, so let's keep it + simple. No mutex is needed, because we are currently single-threaded. + */ + gomp_we_are_forked = 1; +} + /* Launch a team. */ void @@ -288,11 +350,19 @@ gomp_team_start (void (*fn) (void *), void *data, thr = gomp_thread (); nested = thr->ts.team != NULL; + if (__builtin_expect (gomp_we_are_forked, 0)) + { + gomp_free_thread_pool (0); + gomp_we_are_forked = 0; + } if (__builtin_expect (thr->thread_pool == NULL, 0)) { thr->thread_pool = gomp_new_thread_pool (); thr->thread_pool->threads_busy = nthreads; + /* The pool should be cleaned up whenever this thread exits... */ pthread_setspecific (gomp_thread_destructor, thr); + /* ...and also in any fork()ed children. */ + pthread_atfork (NULL, NULL, gomp_after_fork_callback); } pool = thr->thread_pool; task = thr->task; Index: testsuite/libgomp.c/fork-1.c =================================================================== --- testsuite/libgomp.c/fork-1.c (revision 0) +++ testsuite/libgomp.c/fork-1.c (working copy) @@ -0,0 +1,77 @@ +/* { dg-do run } */ +/* { dg-timeout 10 } */ + +#include +#include +#include +#include +#include + +static int saw[4]; + +static void +check_parallel (int exit_on_failure) +{ + memset (saw, 0, sizeof (saw)); + #pragma omp parallel num_threads (2) + { + int iam = omp_get_thread_num (); + saw[iam] = 1; + } + + // Encode failure in status code to report to parent process + if (exit_on_failure) + { + if (saw[0] != 1) + _exit(1); + else if (saw[1] != 1) + _exit(2); + else if (saw[2] != 0) + _exit(3); + else if (saw[3] != 0) + _exit(4); + else + _exit(0); + } + // Use regular assertions + else + { + assert (saw[0] == 1); + assert (saw[1] == 1); + assert (saw[2] == 0); + assert (saw[3] == 0); + } +} + +int +main () +{ + // Initialize the OMP thread pool in the parent process + check_parallel (0); + pid_t fork_pid = fork(); + if (fork_pid == -1) + return 1; + else if (fork_pid == 0) + { + // Call OMP again in the child process and encode failures in exit + // code. + check_parallel (1); + } + else + { + // Check that OMP runtime is still functional in parent process after + // the fork. + check_parallel (0); + + // Wait for the child to finish and check the exit code. + int child_status = 0; + pid_t wait_pid = wait(&child_status); + assert (wait_pid == fork_pid); + assert (WEXITSTATUS (child_status) == 0); + + // Check that the termination of the child process did not impact + // OMP in parent process. + check_parallel (0); + } + return 0; +}