From patchwork Mon Jul 17 08:41:09 2017
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 789313
To: GCC Patches
From: Tom de Vries
Subject: [nvptx, committed, PR81069] Insert diverging jump alap in nvptx_single
Message-ID: <55a722a5-4452-ed93-267e-e44b1d0572ed@mentor.com>
Date: Mon, 17 Jul 2017 10:41:09 +0200

Hi,

Consider nvptx_single:
...
/* Single neutering according to MASK.  FROM is the incoming block and
   TO is the outgoing block.  These may be the same block. Insert at
   start of FROM:

     if (tid.<axis>) goto end.

   and insert before ending branch of TO (if there is such an insn):

     end:

   We currently only use differnt FROM and TO when skipping an entire
   loop.  We could do more if we detected superblocks.  */

static void
nvptx_single (unsigned mask, basic_block from, basic_block to)
...

When compiling libgomp.oacc-fortran/nested-function-1.f90 at -O1, we
observed the following pattern:
...
<bb2>:
  goto bb3;

<bb3>: (with single predecessor)
...

which was translated by nvptx_single into:
...
  if (tid.<axis>) goto end.
  goto bb3;

<bb3>:
end:
...

There is no benefit in executing the goto bb3 in neutered mode, and no
need to, so we might as well insert the neutering branch as late as
possible:
...
  goto bb3;

<bb3>:
  if (tid.<axis>) goto end.
end:
...

This patch implements inserting the neutering branch as late as possible.

[ As it happens, the actual code for
libgomp.oacc-fortran/nested-function-1.f90 at -O1 was more complicated:
there were other bbs in between bb2 and bb3. While this doesn't change
anything from a control flow graph point of view, it did trigger a bug in
the ptx JIT compiler, which inserts the synchronization point for the
diverging branch later than the immediate post-dominator point at the end
label. Consequently, the condition broadcast was executed in divergent
mode (which is known to give undefined results), resulting in a hang.
This patch also works around this ptx JIT compiler bug, for this
test-case. ]

Built and tested on x86_64 with nvptx accelerator.

Committed.

Thanks,
- Tom

Insert diverging jump alap in nvptx_single

2017-07-17  Tom de Vries

	PR target/81069
	* config/nvptx/nvptx.c (nvptx_single): Insert diverging branch as late
	as possible.

---
 gcc/config/nvptx/nvptx.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index daeec27..cb11686 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -3866,9 +3866,25 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
   rtx_insn *tail = BB_END (to);
   unsigned skip_mask = mask;
 
-  /* Find first insn of from block */
-  while (head != BB_END (from) && !INSN_P (head))
-    head = NEXT_INSN (head);
+  while (true)
+    {
+      /* Find first insn of from block.  */
+      while (head != BB_END (from) && !INSN_P (head))
+        head = NEXT_INSN (head);
+
+      if (from == to)
+        break;
+
+      if (!(JUMP_P (head) && single_succ_p (from)))
+        break;
+
+      basic_block jump_target = single_succ (from);
+      if (!single_pred_p (jump_target))
+        break;
+
+      from = jump_target;
+      head = BB_HEAD (from);
+    }
 
   /* Find last insn of to block */
   rtx_insn *limit = from == to ? head : BB_HEAD (to);
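
As a reading aid, the block-walking condition in the loop added by this patch can be modelled outside GCC. The following is only a minimal sketch under stated assumptions, not the GCC code: struct toy_block and latest_insertion_block are hypothetical names, the starts_with_jump flag stands in for JUMP_P on the first real insn, and the edge counts stand in for single_succ_p and single_pred_p. It also assumes that the chain of blocks starting at FROM is acyclic or reaches TO, otherwise the walk would not terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a basic block; this is NOT GCC's basic_block.  */
struct toy_block
{
  bool starts_with_jump;   /* First real insn is a jump (models JUMP_P).  */
  size_t n_succs;          /* Number of outgoing edges.  */
  size_t n_preds;          /* Number of incoming edges.  */
  struct toy_block *succ;  /* The successor; meaningful if n_succs == 1.  */
};

/* Return the block at whose start the neutering branch would go.
   Starting at FROM, step to the successor for as long as the current
   block begins with a jump, has exactly one successor, and that
   successor has exactly one predecessor.  Never step past TO.  */
struct toy_block *
latest_insertion_block (struct toy_block *from, struct toy_block *to)
{
  assert (from != NULL && to != NULL);

  while (from != to
         && from->starts_with_jump
         && from->n_succs == 1
         && from->succ != NULL
         && from->succ->n_preds == 1)
    from = from->succ;

  return from;
}
```

For the bb2/bb3 pattern described in the mail, where bb2 holds only the jump and bb3 has a single predecessor, this model selects bb3. If bb3 had a second predecessor, it would stay at bb2, since moving the branch would then change what the other incoming path executes.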