From patchwork Tue Nov 3 18:11:59 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 539572
To: Richard Guenther, GCC Patches
From: Nathan Sidwell
Subject: OpenACC dimension range propagation optimization
Message-ID: <5638F8EF.6050607@acm.org>
Date: Tue, 3 Nov 2015 13:11:59 -0500

Richard,

this patch implements VRP for the two OpenACC axis internal functions I've
added. We know the position within a dimension is always less than that
dimension's extent. Further, if the extent is dynamic, the target backend
may well know of a hardware-mandated maximum value. Hence I've added a new
target hook that lets the backend specify that upper bound, and taught
extract_range_basic to process the two internal functions.

Incidentally, this was the bit I was working on at the Cauldron, which
caused me to work on the min/max range combining.

OK for trunk?

nathan

2015-11-03  Nathan Sidwell

	* target.def (goacc.dim_limit): New hook.
	* targhooks.h (default_goacc_dim_limit): Declare.
	* doc/tm.texi.in (TARGET_GOACC_DIM_LIMIT): Add.
	* doc/tm.texi: Rebuilt.
	* omp-low.c (default_goacc_dim_limit): New.
	* config/nvptx/nvptx.c (PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH): New.
	(nvptx_goacc_dim_limit): New.
	(TARGET_GOACC_DIM_LIMIT): Override.
	* tree-vrp.c: Include omp-low.h, target.h.
	(extract_range_basic): Add handling for IFN_GOACC_DIM_SIZE &
	IFN_GOACC_DIM_POS.

Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	(revision 229535)
+++ gcc/targhooks.h	(working copy)
@@ -110,6 +110,7 @@ extern void default_destroy_cost_data (v
 
 /* OpenACC hooks.  */
 extern bool default_goacc_validate_dims (tree, int [], int);
+extern int default_goacc_dim_limit (int);
 extern bool default_goacc_fork_join (gcall *, const int [], bool);
 
 /* These are here, and not in hooks.[ch], because not all users of
Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c	(revision 229535)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -3248,6 +3248,10 @@ nvptx_file_end (void)
     }
 }
 
+/* Define dimension sizes for known hardware.  */
+#define PTX_VECTOR_LENGTH 32
+#define PTX_WORKER_LENGTH 32
+
 /* Validate compute dimensions of an OpenACC offload or routine, fill
    in non-unity defaults.  FN_LEVEL indicates the level at which a
    routine might spawn a loop.  It is negative for non-routines.  */
@@ -3264,6 +3268,25 @@ nvptx_goacc_validate_dims (tree ARG_UNUS
   return changed;
 }
 
+/* Return maximum dimension size, or zero for unbounded.  */
+
+static int
+nvptx_goacc_dim_limit (int axis)
+{
+  switch (axis)
+    {
+    case GOMP_DIM_WORKER:
+      return PTX_WORKER_LENGTH;
+
+    case GOMP_DIM_VECTOR:
+      return PTX_VECTOR_LENGTH;
+
+    default:
+      break;
+    }
+  return 0;
+}
+
 /* Determine whether fork & joins are needed.  */
 
 static bool
@@ -3376,6 +3399,9 @@ nvptx_goacc_fork_join (gcall *call, cons
 #undef TARGET_GOACC_VALIDATE_DIMS
 #define TARGET_GOACC_VALIDATE_DIMS nvptx_goacc_validate_dims
 
+#undef TARGET_GOACC_DIM_LIMIT
+#define TARGET_GOACC_DIM_LIMIT nvptx_goacc_dim_limit
+
 #undef TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN nvptx_goacc_fork_join
 
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 229535)
+++ gcc/omp-low.c	(working copy)
@@ -19380,6 +19380,18 @@ default_goacc_validate_dims (tree ARG_UN
   return changed;
 }
 
+/* Default dimension bound is unknown on accelerator and 1 on host.  */
+
+int
+default_goacc_dim_limit (int ARG_UNUSED (axis))
+{
+#ifdef ACCEL_COMPILER
+  return 0;
+#else
+  return 1;
+#endif
+}
+
 namespace {
 
 const pass_data pass_data_oacc_device_lower =
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 229535)
+++ gcc/tree-vrp.c	(working copy)
@@ -58,8 +58,8 @@ along with GCC; see the file COPYING3.
 #include "tree-ssa-threadupdate.h"
 #include "tree-ssa-scopedtables.h"
 #include "tree-ssa-threadedge.h"
-
-
+#include "omp-low.h"
+#include "target.h"
 
 /* Range of values that can be associated with an SSA_NAME after VRP
    has executed.  */
@@ -3976,7 +3976,9 @@ extract_range_basic (value_range *vr, gi
   else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
     {
       enum tree_code subcode = ERROR_MARK;
-      switch (gimple_call_internal_fn (stmt))
+      unsigned ifn_code = gimple_call_internal_fn (stmt);
+
+      switch (ifn_code)
         {
         case IFN_UBSAN_CHECK_ADD:
           subcode = PLUS_EXPR;
@@ -3987,6 +3989,37 @@ extract_range_basic (value_range *vr, gi
         case IFN_UBSAN_CHECK_MUL:
           subcode = MULT_EXPR;
           break;
+        case IFN_GOACC_DIM_SIZE:
+        case IFN_GOACC_DIM_POS:
+          /* Optimizing these two internal functions helps the loop
+             optimizer eliminate outer comparisons.  Size is [1,N]
+             and pos is [0,N-1].  */
+          {
+            bool is_pos = ifn_code == IFN_GOACC_DIM_POS;
+            tree attr = get_oacc_fn_attrib (current_function_decl);
+            tree arg = gimple_call_arg (stmt, 0);
+            int axis = TREE_INT_CST_LOW (arg);
+            tree dims = TREE_VALUE (attr);
+
+            for (int ix = axis; ix--;)
+              dims = TREE_CHAIN (dims);
+            int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
+
+            if (!size)
+              /* If it's dynamic, the backend might know a hardware
+                 limitation.  */
+              size = targetm.goacc.dim_limit (axis);
+
+            if (size)
+              {
+                tree type = TREE_TYPE (gimple_call_lhs (stmt));
+
+                set_value_range (vr, VR_RANGE,
+                                 build_int_cst (type, is_pos ? 0 : 1),
+                                 build_int_cst (type, size - is_pos), NULL);
+              }
+          }
+          return;
         default:
           break;
         }
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 229535)
+++ gcc/doc/tm.texi	(working copy)
@@ -5777,6 +5777,11 @@ true, if changes have been made. You mu
 provide dimensions larger than 1.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_GOACC_DIM_LIMIT (int @var{axis})
+This hook should return the maximum size of a particular dimension,
+or zero if unbounded.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_GOACC_FORK_JOIN (gcall *@var{call}, const int *@var{dims}, bool @var{is_fork})
 This hook can be used to convert IFN_GOACC_FORK and IFN_GOACC_JOIN
 function calls to target-specific gimple, or indicate whether they
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	(revision 229535)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -4262,6 +4262,8 @@ address; but often a machine-dependent
 
 @hook TARGET_GOACC_VALIDATE_DIMS
 
+@hook TARGET_GOACC_DIM_LIMIT
+
 @hook TARGET_GOACC_FORK_JOIN
 
 @node Anchored Addresses
Index: gcc/target.def
===================================================================
--- gcc/target.def	(revision 229535)
+++ gcc/target.def	(working copy)
@@ -1659,6 +1659,13 @@ bool, (tree decl, int *dims, int fn_leve
 default_goacc_validate_dims)
 
 DEFHOOK
+(dim_limit,
+"This hook should return the maximum size of a particular dimension,\n\
+or zero if unbounded.",
+int, (int axis),
+default_goacc_dim_limit)
+
+DEFHOOK
 (fork_join,
 "This hook can be used to convert IFN_GOACC_FORK and IFN_GOACC_JOIN\n\
 function calls to target-specific gimple, or indicate whether they\n\
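
For readers following the tree-vrp.c hunk, here is a minimal standalone sketch of the range computation it performs, written outside GCC so it can be compiled on its own. It is not part of the patch. The names DIM_GANG, DIM_WORKER, DIM_VECTOR, struct dim_range, sketch_dim_limit and sketch_dim_range are hypothetical stand-ins for illustration, and the flat dims array is an assumption that replaces the attribute list GCC actually walks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's GOMP_DIM_* axis numbering.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Closed integer interval [min, max].  KNOWN is false when no bound
   could be derived, in which case min and max are meaningless.  */
struct dim_range
{
  bool known;
  int min;
  int max;
};

/* Same shape as nvptx_goacc_dim_limit in the patch: a hardware
   ceiling for worker and vector, zero (unbounded) otherwise.  */
static int
sketch_dim_limit (int axis)
{
  switch (axis)
    {
    case DIM_WORKER:
    case DIM_VECTOR:
      return 32;

    default:
      break;
    }
  return 0;
}

/* Compute the range of a dimension's size ([1,N]) or of a position
   within it ([0,N-1]).  DIMS holds the compile-time extents, with
   zero meaning the extent is dynamic.  */
static struct dim_range
sketch_dim_range (const int dims[DIM_MAX], int axis, bool is_pos)
{
  assert (axis >= 0 && axis < DIM_MAX);

  int size = dims[axis];
  if (!size)
    /* Dynamic extent: fall back to any hardware limit.  */
    size = sketch_dim_limit (axis);

  struct dim_range r = { false, 0, 0 };
  if (size)
    {
      r.known = true;
      r.min = is_pos ? 0 : 1;
      r.max = size - is_pos;
    }
  return r;
}
```

With extents { 0, 8, 0 }, a worker position gets [0,7] from the static extent, a vector size gets [1,32] from the hardware limit, and the gang axis yields no range because neither source supplies a bound.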