From patchwork Mon Nov 9 16:31:19 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 541832
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: [PATCH, 5/16] Add in_oacc_kernels_region in struct loop
To: "gcc-patches@gnu.org"
References: <5640BD31.2060602@mentor.com>
CC: Jakub Jelinek, Richard Biener
From: Tom de Vries
Message-ID: <5640CA57.7090007@mentor.com>
Date: Mon, 9 Nov 2015 17:31:19 +0100
In-Reply-To: <5640BD31.2060602@mentor.com>

On 09/11/15 16:35, Tom de Vries wrote:
> Hi,
>
> this patch series for stage1 trunk adds support to:
> - parallelize oacc kernels regions using parloops, and
> - map the loops onto the oacc gang dimension.
>
> The patch series contains these patches:
>
>       1	Insert new exit block only when needed in
>         transform_to_exit_first_loop_alt
>       2	Make create_parallel_loop return void
>       3	Ignore reduction clause on kernels directive
>       4	Implement -foffload-alias
>       5	Add in_oacc_kernels_region in struct loop
>       6	Add pass_oacc_kernels
>       7	Add pass_dominator_oacc_kernels
>       8	Add pass_ch_oacc_kernels
>       9	Add pass_parallelize_loops_oacc_kernels
>      10	Add pass_oacc_kernels pass group in passes.def
>      11	Update testcases after adding kernels pass group
>      12	Handle acc loop directive
>      13	Add c-c++-common/goacc/kernels-*.c
>      14	Add gfortran.dg/goacc/kernels-*.f95
>      15	Add libgomp.oacc-c-c++-common/kernels-*.c
>      16	Add libgomp.oacc-fortran/kernels-*.f95
>
> The first 9 patches are more or less independent, but patches 10-16 are
> intended to be committed at the same time.
>
> Bootstrapped and reg-tested on x86_64.
>
> Build and reg-tested with nvidia accelerator, in combination with a
> patch that enables accelerator testing (which is submitted at
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>
> I'll post the individual patches in reply to this message.

This patch adds and initializes the in_oacc_kernels_region field in
struct loop.

The field is used to signal to subsequent passes that we're dealing with
a loop in a kernels region that we're trying to parallelize.

Note that we do not parallelize kernels regions with more than one loop
nest.  [ In general, kernels regions with more than one loop nest should
be split up into separate kernels regions, but that's not supported at
the moment. ]

Thanks,
- Tom

Add in_oacc_kernels_region in struct loop

2015-11-09  Tom de Vries

	* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
	* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
	(expand_omp_target): Call mark_loops_in_oacc_kernels_region.
---
 gcc/cfgloop.h |  3 +++
 gcc/omp-low.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index d052c13..7121d73 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12429,6 +12429,61 @@ get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+                                   basic_block region_exit)
+{
+  bitmap dominated_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  unsigned di;
+  basic_block bb;
+
+  bitmap_clear (dominated_bitmap);
+  bitmap_clear (excludes_bitmap);
+
+  /* Get all the blocks dominated by the region entry.  That will include the
+     entire region.  */
+  vec<basic_block> dominated
+    = get_all_dominated_blocks (CDI_DOMINATORS, region_entry);
+  FOR_EACH_VEC_ELT (dominated, di, bb)
+    bitmap_set_bit (dominated_bitmap, bb->index);
+
+  /* Exclude all the blocks which are not in the region: the blocks dominated by
+     the region exit.  */
+  if (region_exit != NULL)
+    {
+      vec<basic_block> excludes
+        = get_all_dominated_blocks (CDI_DOMINATORS, region_exit);
+      FOR_EACH_VEC_ELT (excludes, di, bb)
+        bitmap_set_bit (excludes_bitmap, bb->index);
+    }
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *loop;
+  FOR_EACH_LOOP (loop, 0)
+    {
+      if (loop_outer (loop) != current_loops->tree_root)
+        continue;
+
+      if (bitmap_bit_p (dominated_bitmap, loop->header->index)
+          && !bitmap_bit_p (excludes_bitmap, loop->header->index))
+        nr_outer_loops++;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  /* Mark the loops in the region.  */
+  FOR_EACH_LOOP (loop, 0)
+    if (bitmap_bit_p (dominated_bitmap, loop->header->index)
+        && !bitmap_bit_p (excludes_bitmap, loop->header->index))
+      loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12483,6 +12538,9 @@ expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;
-- 
1.9.1
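
For readers who want to follow the marking logic without a GCC tree at
hand, here is a rough standalone sketch of what
mark_loops_in_oacc_kernels_region does.  It is not GCC code: struct
toy_loop, the toy_* functions and the two bool arrays are hypothetical
stand-ins for struct loop, the dominator queries and the bitmaps, and the
8-block CFG is made up for illustration.  A loop counts as inside the
region when its header is dominated by the region entry but not by the
region exit, and nothing is marked unless exactly one outermost loop is
found there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's struct loop; only what the sketch needs.  */
struct toy_loop
{
  int header_bb;               /* Index of the loop header block.  */
  bool is_outermost;           /* True if no other loop encloses this one.  */
  bool in_oacc_kernels_region; /* The flag the patch adds.  */
};

/* A header is in the region if the region entry dominates it and the
   region exit does not.  */
static bool
toy_header_in_region (const struct toy_loop *loop,
                      const bool *dominated_by_entry,
                      const bool *dominated_by_exit)
{
  return (dominated_by_entry[loop->header_bb]
          && !dominated_by_exit[loop->header_bb]);
}

/* Mark the loops in the region, but only if the region holds exactly one
   outermost loop.  Return the number of loops marked.  */
static size_t
toy_mark_loops (struct toy_loop *loops, size_t nr_loops,
                const bool *dominated_by_entry,
                const bool *dominated_by_exit)
{
  assert (loops != NULL || nr_loops == 0);

  size_t nr_outer_loops = 0;
  for (size_t i = 0; i < nr_loops; i++)
    if (loops[i].is_outermost
        && toy_header_in_region (&loops[i], dominated_by_entry,
                                 dominated_by_exit))
      nr_outer_loops++;

  /* More than one loop nest (or none): leave every flag untouched.  */
  if (nr_outer_loops != 1)
    return 0;

  size_t nr_marked = 0;
  for (size_t i = 0; i < nr_loops; i++)
    if (toy_header_in_region (&loops[i], dominated_by_entry,
                              dominated_by_exit))
      {
        loops[i].in_oacc_kernels_region = true;
        nr_marked++;
      }
  return nr_marked;
}

/* Made-up CFG with blocks 0..7: the region entry dominates blocks 2..7,
   the region exit dominates blocks 6..7.  */
static const bool toy_dom_entry[8]
  = { false, false, true, true, true, true, true, true };
static const bool toy_dom_exit[8]
  = { false, false, false, false, false, false, true, true };

/* One loop nest (headers 3 and 4) inside the region, one loop (header 6)
   after the region exit.  */
size_t
toy_demo_single_nest (void)
{
  struct toy_loop loops[] = {
    { 3, true, false },
    { 4, false, false },
    { 6, true, false },
  };
  return toy_mark_loops (loops, 3, toy_dom_entry, toy_dom_exit);
}

/* Two separate loop nests (headers 3 and 5) inside the region.  */
size_t
toy_demo_two_nests (void)
{
  struct toy_loop loops[] = {
    { 3, true, false },
    { 5, true, false },
  };
  return toy_mark_loops (loops, 2, toy_dom_entry, toy_dom_exit);
}
```

In this sketch toy_demo_single_nest marks the outer and the inner loop of
the single nest and skips the loop behind the region exit, while
toy_demo_two_nests marks nothing, which mirrors the "more than one loop
nest" restriction mentioned in the patch description.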