From patchwork Mon Nov 16 11:39:01 2015
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 544978
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, 5/16] Add in_oacc_kernels_region in struct loop
To: Richard Biener
References: <5640BD31.2060602@mentor.com> <5640CA57.7090007@mentor.com>
CC: "gcc-patches@gnu.org", Jakub Jelinek
From: Tom de Vries
Message-ID: <5649C055.60709@mentor.com>
Date: Mon, 16 Nov 2015 12:39:01 +0100

On 11/11/15 11:55, Richard Biener wrote:
> On Mon, 9 Nov 2015, Tom de Vries wrote:
>
>> On 09/11/15 16:35, Tom de Vries wrote:
>>> Hi,
>>>
>>> this patch series for stage1 trunk adds support to:
>>> - parallelize oacc kernels regions using parloops, and
>>> - map the loops onto the oacc gang dimension.
>>>
>>> The patch series contains these patches:
>>>
>>>       1  Insert new exit block only when needed in
>>>          transform_to_exit_first_loop_alt
>>>       2  Make create_parallel_loop return void
>>>       3  Ignore reduction clause on kernels directive
>>>       4  Implement -foffload-alias
>>>       5  Add in_oacc_kernels_region in struct loop
>>>       6  Add pass_oacc_kernels
>>>       7  Add pass_dominator_oacc_kernels
>>>       8  Add pass_ch_oacc_kernels
>>>       9  Add pass_parallelize_loops_oacc_kernels
>>>      10  Add pass_oacc_kernels pass group in passes.def
>>>      11  Update testcases after adding kernels pass group
>>>      12  Handle acc loop directive
>>>      13  Add c-c++-common/goacc/kernels-*.c
>>>      14  Add gfortran.dg/goacc/kernels-*.f95
>>>      15  Add libgomp.oacc-c-c++-common/kernels-*.c
>>>      16  Add libgomp.oacc-fortran/kernels-*.f95
>>>
>>> The first 9 patches are more or less independent, but patches 10-16 are
>>> intended to be committed at the same time.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> Built and reg-tested with nvidia accelerator, in combination with a
>>> patch that enables accelerator testing (which is submitted at
>>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>>>
>>> I'll post the individual patches in reply to this message.
>>
>> this patch adds and initializes the in_oacc_kernels_region field in
>> struct loop.
>>
>> The field is used to signal to subsequent passes that we're dealing with a
>> loop in a kernels region that we're trying to parallelize.
>>
>> Note that we do not parallelize kernels regions with more than one loop nest.
>> [ In general, kernels regions with more than one loop nest should be split up
>> into separate kernels regions, but that's not supported atm. ]
>
> I think mark_loops_in_oacc_kernels_region can be greatly simplified.
>
> Both region entry and exit should have the same ->loop_father (a SESE
> region).  Then you can just walk that loop's inner (and their sibling)
> loops checking their header domination relation with the region entry
> and exit (only necessary for direct inner loops).

Updated patch to use the loops structure.

Atm I'm also skipping loops containing sibling loops, since I have no
test-cases for that yet.

Thanks,
- Tom

Add in_oacc_kernels_region in struct loop

2015-11-09  Tom de Vries

	* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
	* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
	(expand_omp_target): Call mark_loops_in_oacc_kernels_region.

---
 gcc/cfgloop.h |  3 +++
 gcc/omp-low.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5f76434..fba7bbd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12450,6 +12450,46 @@ get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+                                   basic_block region_exit)
+{
+  struct loop *outer = region_entry->loop_father;
+  gcc_assert (region_exit == NULL || outer == region_exit->loop_father);
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *single_outer;
+  for (struct loop *loop = outer->inner; loop != NULL; loop = loop->next)
+    {
+      gcc_assert (loop_outer (loop) == outer);
+
+      if (!dominated_by_p (CDI_DOMINATORS, loop->header, region_entry))
+        continue;
+
+      if (region_exit != NULL
+          && dominated_by_p (CDI_DOMINATORS, loop->header, region_exit))
+        continue;
+
+      nr_outer_loops++;
+      single_outer = loop;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  for (struct loop *loop = single_outer->inner; loop != NULL; loop = loop->inner)
+    if (loop->next)
+      return;
+
+  /* Mark the loops in the region.  */
+  for (struct loop *loop = single_outer; loop != NULL; loop = loop->inner)
+    loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12505,6 +12545,9 @@ expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;