From patchwork Wed Jun 3 11:45:27 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 479898 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 17A88140291 for ; Wed, 3 Jun 2015 21:45:49 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=HHPL6oys; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type; q=dns; s=default; b=IQ0G3fqp/CWI492/ GWTsZfzw3hAtiWnNCaAyRnXm8NmNRayxNJWuRlJ91AHCGLbIUpvI/LRiAP3IC7iS RJg8ELpjdBp0LqL4dnU7ZLJl/3OC3BIfjNR3N2kCr7GiWlw9W3RluWwqM+V05Ta6 /vX/sIxech+yPNWuvRSMqiPiHuA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type; s=default; bh=wyF7FxSV4yIWNWx6VQoc0Y Wr4yM=; b=HHPL6oyszsW/SvH9sX2CEKr4MzgVOaoazUry6Iz/nf+sNX3e/LgTOs wxxrpKUw1faJDLbXlDI7mccVTwLUjv90exhbYQtzwvDarQTqRNXcqQkaMJLMMRuG SsopQl7jChEhnOO1fONRDTlRS9KeoMb+aIplPR6Z/5VAFACqLFv3A= Received: (qmail 66859 invoked by alias); 3 Jun 2015 11:45:43 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 66850 invoked by uid 89); 3 Jun 2015 11:45:42 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 03 Jun 2015 11:45:41 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-02.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1Z0772-0001Gc-N2 from Julian_Brown@mentor.com ; Wed, 03 Jun 2015 04:45:37 -0700 Received: from octopus (137.202.0.76) by SVR-IES-FEM-02.mgc.mentorg.com (137.202.0.106) with Microsoft SMTP Server id 14.3.224.2; Wed, 3 Jun 2015 12:45:35 +0100 Date: Wed, 3 Jun 2015 12:45:27 +0100 From: Julian Brown To: Richard Biener CC: GCC Patches , Bernd Schmidt , Jakub Jelinek , "Thomas Schwinge" Subject: Re: [gomp4] Preserve NVPTX "reconvergence" points Message-ID: <20150603124527.48eb7d6f@octopus> In-Reply-To: References: <20150528150635.7bd5db23@octopus> MIME-Version: 1.0 X-IsSubscribed: yes On Thu, 28 May 2015 16:37:04 +0200 Richard Biener wrote: > On Thu, May 28, 2015 at 4:06 PM, Julian Brown > wrote: > > For NVPTX, it is vitally important that the divergence of threads > > within a warp can be controlled: in particular we must be able to > > generate code that we know "reconverges" at a particular point. > > Unfortunately GCC's middle-end optimisers can cause this property to > > be violated, which causes problems for the OpenACC execution model > > we're planning to use for NVPTX. > > Hmm, I don't think adding a new edge flag is good nor necessary. It > seems to me that instead the broadcast operation should have abnormal > control flow and thus basic-blocks should be split either before or > after it (so either incoming or outgoing edge(s) should be > abnormal). I suppose splitting before the broadcast would be best > (thus handle it similar to setjmp ()). Here's a version of the patch that uses abnormal edges with semantics unchanged, splitting the "false"/non-execution edge using a dummy block to avoid the prohibited case of both EDGE_TRUE/EDGE_FALSE and EDGE_ABNORMAL on the outgoing edges of a GIMPLE_COND. So for a fragment like this: if (threadIdx.x == 0) /* cond_bb */ { /* work */ p0 = ...; /* assign */ } pN = broadcast(p0); if (pN) goto T; else goto F; Incoming edges to a broadcast operation have EDGE_ABNORMAL set: +--------+ |cond_bb |--------, +--------+ | | (true edge) | (false edge) v v +--------+ +-------+ | (work) | | dummy | +--------+ +-------+ | assign | | +--------+ | ABNORM| |ABNORM v | +--------+<-------' | bcast | +--------+ | cond | +--------+ / \ T F The abnormal edges actually serve two purposes, I think: as well as ensuring the broadcast operation takes place when a warp is non-diverged/coherent, they ensure that p0 is not seen as uninitialised along the "false" path from cond_bb, possibly leading to the broadcast operation being optimised away as partially redundant. This feels somewhat fragile though! We'll have to continue to think about warp divergence in subsequent patches. The patch passes libgomp testing (with Bernd's recent worker-single patch also). OK for gomp4 branch (together with the previously-mentioned inline thread builtin patch)? Thanks, Julian ChangeLog gcc/ * omp-low.c (make_predication_test): Split false block out of cond_bb, making latter edge abnormal. (predicate_bb): Set EDGE_ABNORMAL on edges before broadcast operations. commit 38056ae4a29f93ce54715dfad843a233f3b0fd2a Author: Julian Brown Date: Mon Jun 1 11:12:41 2015 -0700 Use abnormal edges before broadcast ops diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 7048f9f..310eb72 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -10555,7 +10555,16 @@ make_predication_test (edge true_edge, basic_block skip_dest_bb, int mask) gsi_insert_after (&tmp_gsi, cond_stmt, GSI_NEW_STMT); true_edge->flags = EDGE_TRUE_VALUE; - make_edge (cond_bb, skip_dest_bb, EDGE_FALSE_VALUE); + + /* Force an abnormal edge before a broadcast operation that might be present + in SKIP_DEST_BB. This is only done for the non-execution edge (with + respect to the predication done by this function) -- the opposite + (execution) edge that reaches the broadcast operation must be made + abnormal also, e.g. in this function's caller. */ + edge e = make_edge (cond_bb, skip_dest_bb, EDGE_FALSE_VALUE); + basic_block false_abnorm_bb = split_edge (e); + edge abnorm_edge = single_succ_edge (false_abnorm_bb); + abnorm_edge->flags |= EDGE_ABNORMAL; } /* Apply OpenACC predication to basic block BB which is in @@ -10605,6 +10614,7 @@ predicate_bb (basic_block bb, struct omp_region *parent, int mask) mask); edge e = split_block (bb, splitpoint); + e->flags = EDGE_ABNORMAL; skip_dest_bb = e->dest; gimple_cond_set_condition (as_a (stmt), EQ_EXPR, @@ -10624,6 +10634,7 @@ predicate_bb (basic_block bb, struct omp_region *parent, int mask) gsi_asgn, mask); edge e = split_block (bb, splitpoint); + e->flags = EDGE_ABNORMAL; skip_dest_bb = e->dest; gimple_switch_set_index (sstmt, new_var);