From patchwork Tue Aug 20 11:36:24 2019
X-Patchwork-Submitter: Chung-Lin Tang
X-Patchwork-Id: 1150048
From: Chung-Lin Tang
Subject: [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches
To: gcc-patches, Jakub Jelinek, Thomas Schwinge
Date: Tue, 20 Aug 2019 19:36:24 +0800

Hi Jakub, Thomas,

this is a re-submission of the patch-set from [1].
The term "dynamic arrays" didn't go over well with Jakub the last time, so
this time I'm referring to this functionality as "non-contiguous arrays".

    int *a[100], **b;

    // re-constructs array slices on GPU and copies data in
    #pragma acc parallel copyin (a[0:n][0:m], b[1:x][5:y])

The overall implementation has not changed much since the last submission;
the changes are mainly the renaming and a rebase to current trunk.

This first patch contains the C/C++ front-end changes.

Thanks,
Chung-Lin

[1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00937.html

	gcc/c/
	* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/cp/
	* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.
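For illustration, here is a minimal sketch of the kind of user code the
copyin example above is aimed at: a pointer-to-pointer matrix whose rows are
allocated separately, with a two-dimensional array section naming a sub-block
of it. The helper names alloc_rows, free_rows and sum_slice are hypothetical
and not part of the patch. The pragma is guarded by _OPENACC so that, as an
assumption of this sketch, the same file also builds and runs on the host
when OpenACC is not enabled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the patch): build a ROWS x COLS matrix
   out of separately allocated rows, so the rows need not be adjacent in
   host memory.  Returns NULL on allocation failure.  */
int **
alloc_rows (int rows, int cols)
{
  int **b = malloc (rows * sizeof *b);
  if (b == NULL)
    return NULL;
  for (int i = 0; i < rows; i++)
    {
      b[i] = calloc (cols, sizeof **b);
      if (b[i] == NULL)
        {
          /* Undo the rows allocated so far.  */
          while (i-- > 0)
            free (b[i]);
          free (b);
          return NULL;
        }
    }
  return b;
}

/* Hypothetical helper: release a matrix created by alloc_rows.  */
void
free_rows (int **b, int rows)
{
  for (int i = 0; i < rows; i++)
    free (b[i]);
  free (b);
}

/* Sum the section b[1:x][5:y], i.e. X rows starting at row 1 and
   Y columns starting at column 5 (OpenACC sections are start:length).  */
long
sum_slice (int **b, int x, int y)
{
  assert (b != NULL && x >= 0 && y >= 0);
  long sum = 0;
#ifdef _OPENACC
#pragma acc parallel loop copyin (b[1:x][5:y]) reduction (+:sum)
#endif
  for (int i = 1; i < 1 + x; i++)
    for (int j = 5; j < 5 + y; j++)
      sum += b[i][j];
  return sum;
}
```

A caller would fill a matrix obtained from alloc_rows, call sum_slice with
the row and column counts of the block it wants, and release the matrix with
free_rows. The loop bounds deliberately mirror the section bounds, so the
loop only touches elements named in the copyin clause.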
Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c	(revision 274618)
+++ gcc/c/c-typeck.c	(working copy)
@@ -12848,7 +12848,7 @@ c_finish_omp_cancellation_point (location_t loc, t
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec &types,
                              bool &maybe_zero_len, unsigned int &first_non_one,
-                             enum c_omp_region_type ort)
+                             bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -12933,7 +12933,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec 0
+              && (TREE_CODE (low_bound) != INTEGER_CST
+                  || integer_nonzerop (low_bound)
+                  || (length && (TREE_CODE (length) != INTEGER_CST
+                                 || !tree_int_cst_equal (size, length)))))
+            {
+              tree x = types.last ();
+              if (TREE_CODE (x) == POINTER_TYPE)
+                non_contiguous = true;
+            }
         }
       else if (length == NULL_TREE)
         {
@@ -13142,7 +13158,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -13185,7 +13205,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
                                             maybe_zero_len, first_non_one,
-                                            ort);
+                                            non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -13218,6 +13238,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
         maybe_zero_len = true;
@@ -13241,6 +13262,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
           length = fold_convert (sizetype, length);
           if (low_bound == NULL_TREE)
             low_bound = integer_zero_node;
+
+          if (non_contiguous)
+            {
+              ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+              continue;
+            }
+
           if (!maybe_zero_len && i > first_non_one)
             {
               if (integer_nonzerop (low_bound))
@@ -13337,6 +13365,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
                 size = size_binop (MULT_EXPR, size, l);
             }
         }
+      if (non_contiguous)
+        {
+          int kind = OMP_CLAUSE_MAP_KIND (c);
+          OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+          OMP_CLAUSE_DECL (c) = t;
+          OMP_CLAUSE_SIZE (c) = ncarray_dims;
+          return false;
+        }
       if (side_effects)
         size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
       if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
Index: gcc/cp/semantics.c
===================================================================
--- gcc/cp/semantics.c	(revision 274618)
+++ gcc/cp/semantics.c	(working copy)
@@ -4626,7 +4626,7 @@ omp_privatize_field (tree t, bool shared)
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec &types,
                              bool &maybe_zero_len, unsigned int &first_non_one,
-                             enum c_omp_region_type ort)
+                             bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -4711,7 +4711,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec 0
+              && (TREE_CODE (low_bound) != INTEGER_CST
+                  || integer_nonzerop (low_bound)
+                  || (length && (TREE_CODE (length) != INTEGER_CST
+                                 || !tree_int_cst_equal (size, length)))))
+            {
+              tree x = types.last ();
+              if (TREE_CODE (x) == POINTER_TYPE)
+                non_contiguous = true;
+            }
         }
       else if (length == NULL_TREE)
         {
@@ -4932,7 +4948,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -4975,7 +4995,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
                                             maybe_zero_len, first_non_one,
-                                            ort);
+                                            non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -5009,6 +5029,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
         maybe_zero_len = true;
@@ -5034,6 +5055,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
           length = fold_convert (sizetype, length);
           if (low_bound == NULL_TREE)
             low_bound = integer_zero_node;
+
+          if (non_contiguous)
+            {
+              ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+              continue;
+            }
+
           if (!maybe_zero_len && i > first_non_one)
             {
               if (integer_nonzerop (low_bound))
@@ -5125,6 +5153,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
         }
       if (!processing_template_decl)
         {
+          if (non_contiguous)
+            {
+              int kind = OMP_CLAUSE_MAP_KIND (c);
+              OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+              OMP_CLAUSE_DECL (c) = t;
+              OMP_CLAUSE_SIZE (c) = ncarray_dims;
+              return false;
+            }
           if (side_effects)
             size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
           if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION

From patchwork Tue Aug 20 11:36:39 2019
X-Patchwork-Submitter: Chung-Lin Tang
X-Patchwork-Id: 1150049
From: Chung-Lin Tang
Subject: [PATCH, OpenACC, 2/3] Non-contiguous array support for OpenACC data clauses (re-submission), compiler patches
To: gcc-patches, Jakub Jelinek, Thomas Schwinge
Message-ID: <43a11910-525e-13da-757e-a3f5046d5a10@mentor.com>
Date: Tue, 20 Aug 2019 19:36:39 +0800

These are the patches for gimplify, omp-low, and include/gomp-constants.h.

One issue that Jakub raised in the last review email on the omp-low changes [1]
was the use of DECL_IGNORED_P. Because the descriptor variables are created
with create_tmp_var_raw(), they already have DECL_IGNORED_P set, so this
shouldn't be an issue here.

The use of '$' in identifier names has also been removed.

[1] https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01297.html

Thanks,
Chung-Lin

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): For non-contiguous array map
	kinds, make sure the bias in each dimension is put into a firstprivate
	variable.
	* omp-low.c (struct omp_context): Add 'hash_map *non_contiguous_arrays'
	field, also added include of "tree-hash-traits.h".
	(append_field_to_record_type): New function.
	(create_noncontig_array_descr_type): Likewise.
	(create_noncontig_array_descr_init_code): Likewise.
	(new_omp_context): Add initialization of non_contiguous_arrays field.
	(delete_omp_context): Add deletion of non_contiguous_arrays field.
	(scan_sharing_clauses): For non-contiguous array map kinds, check for
	supported dimension structure, and install non-contiguous array
	variable into current omp_context.
	(lower_omp_target): Add handling for non-contiguous array map kinds.
	(noncontig_array_lookup): New function.
	(noncontig_array_reference_start): Likewise.
	(scan_for_op): Likewise.
	(scan_for_reference): Likewise.
	(ncarray_create_bias): Likewise.
	(ncarray_dimension_peel): Likewise.
	(lower_omp_1): Add case to look for start of non-contiguous array
	reference, and handle bias adjustments for the code sequence.
	* tree-pretty-print.c (dump_omp_clauses): Add cases for printing
	GOMP_MAP_NONCONTIG_ARRAY map kinds.

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY,
	GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM,
	GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_ALLOC,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_NONCONTIG_ARRAY_P): Define.

Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 274618)
+++ gcc/gimplify.c	(working copy)
@@ -8563,9 +8563,29 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_se
           if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
             OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
                                   : TYPE_SIZE_UNIT (TREE_TYPE (decl));
-          if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
-                             NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+          if (OMP_CLAUSE_SIZE (c)
+              && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
+              && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
             {
+              tree dims = OMP_CLAUSE_SIZE (c);
+              for (tree t = dims; t; t = TREE_CHAIN (t))
+                {
+                  /* If a dimension bias isn't a constant, we have to ensure
+                     that the value gets transferred to the offload target.  */
+                  tree low_bound = TREE_PURPOSE (t);
+                  if (TREE_CODE (low_bound) != INTEGER_CST)
+                    {
+                      low_bound = get_initialized_tmp_var (low_bound, pre_p,
+                                                           NULL, false);
+                      omp_add_variable (ctx, low_bound,
+                                        GOVD_FIRSTPRIVATE | GOVD_SEEN);
+                      TREE_PURPOSE (t) = low_bound;
+                    }
+                }
+            }
+          else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
+                                  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+            {
               remove = true;
               break;
             }
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 274618)
+++ gcc/omp-low.c	(working copy)
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "hsa-common.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "tree-hash-traits.h"
 
 /* Lowering of OMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -127,6 +128,9 @@ struct omp_context
      corresponding tracking loop iteration variables.  */
   hash_map *lastprivate_conditional_map;
 
+  /* Hash map of non-contiguous arrays in this context.  */
+  hash_map *non_contiguous_arrays;
+
   /* Nesting depth of this context.  Used to beautify error messages re
      invalid gotos.  The outermost ctx is depth 1, with depth 0 being
      reserved for the main body of the function.  */
@@ -885,6 +889,137 @@ omp_copy_decl (tree var, copy_body_data *cb)
   return error_mark_node;
 }
 
+/* Helper function for create_noncontig_array_descr_type(), to append a new field
+   to a record type.  */
+
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for non-contiguous array descriptor.  Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_noncontig_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree array_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  array_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_noncontig_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, array_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (array_descr_type) = name;
+  TYPE_ARTIFICIAL (array_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (array_descr_type, DECL_NAME (decl),
+                               (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+                                ? main_var_type
+                                : build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (array_descr_type, get_identifier ("__dim_num"),
+                               sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_base", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+                                   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_length", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+                                   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_elem_size", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+                                   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_is_array", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+                                   sizetype);
+    }
+
+  layout_type (array_descr_type);
+  *dim_num = n;
+  return array_descr_type;
+}
+
+/* Generate code sequence for initializing non-contiguous array descriptor.  */
+
+static void
+create_noncontig_array_descr_init_code (tree array_descr, tree array_var,
+                                        tree dimensions, int dim_num,
+                                        gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree array_descr_type = TREE_TYPE (array_descr);
+  tree dim_type = TREE_TYPE (array_var);
+
+  fld = TYPE_FIELDS (array_descr_type);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+                            ? build_fold_addr_expr (array_var) : array_var),
+                   ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+                           ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Create a new context, with OUTER_CTX being the surrounding context.  */
 
 static omp_context *
@@ -921,6 +1056,8 @@ new_omp_context (gimple *stmt, omp_context *outer_
 
   ctx->cb.decl_map = new hash_map;
 
+  ctx->non_contiguous_arrays = new hash_map;
+
   return ctx;
 }
 
@@ -1003,6 +1140,8 @@ delete_omp_context (splay_tree_value value)
 
   delete ctx->lastprivate_conditional_map;
 
+  delete ctx->non_contiguous_arrays;
+
   XDELETE (ctx);
 }
 
@@ -1353,6 +1492,42 @@ scan_sharing_clauses (tree clauses, omp_context *c
               install_var_local (decl, ctx);
               break;
             }
+
+          if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+              && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+            {
+              tree array_decl = OMP_CLAUSE_DECL (c);
+              tree array_dimensions = OMP_CLAUSE_SIZE (c);
+              tree array_type = TREE_TYPE (array_decl);
+              bool by_ref = (TREE_CODE (array_type) == ARRAY_TYPE
+                             ? true : false);
+
+              /* Checking code to ensure we only have arrays at top dimension.
+                 This limitation might be lifted in the future.  */
+              if (TREE_CODE (array_type) == REFERENCE_TYPE)
+                array_type = TREE_TYPE (array_type);
+              tree t = array_type, prev_t = NULL_TREE;
+              while (t)
+                {
+                  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+                    {
+                      error_at (gimple_location (ctx->stmt), "array types are"
+                                " only allowed at outermost dimension of"
+                                " non-contiguous array");
+                      break;
+                    }
+                  prev_t = t;
+                  t = TREE_TYPE (t);
+                }
+
+              install_var_field (array_decl, by_ref, 3, ctx);
+              tree new_var = install_var_local (array_decl, ctx);
+
+              bool existed = ctx->non_contiguous_arrays->put (new_var, array_dimensions);
+              gcc_assert (!existed);
+              break;
+            }
+
           if (DECL_P (decl))
             {
               if (DECL_SIZE (decl)
@@ -2583,6 +2758,50 @@ scan_omp_single (gomp_single *stmt, omp_context *o
     layout_type (ctx->record_type);
 }
 
+/* Reorder clauses so that non-contiguous array map clauses are placed at the very
+   front of the chain.  */
+
+static void
+reorder_noncontig_array_clauses (tree *clauses_ptr)
+{
+  tree c, clauses = *clauses_ptr;
+  tree prev_clause = NULL_TREE, next_clause;
+  tree array_clauses = NULL_TREE, array_clauses_tail = NULL_TREE;
+
+  for (c = clauses; c; c = next_clause)
+    {
+      next_clause = OMP_CLAUSE_CHAIN (c);
+
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+          && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+        {
+          /* Unchain c from clauses.  */
+          if (c == clauses)
+            clauses = next_clause;
+
+          /* Link on to array_clauses.  */
+          if (array_clauses_tail)
+            OMP_CLAUSE_CHAIN (array_clauses_tail) = c;
+          else
+            array_clauses = c;
+          array_clauses_tail = c;
+
+          if (prev_clause)
+            OMP_CLAUSE_CHAIN (prev_clause) = next_clause;
+          continue;
+        }
+
+      prev_clause = c;
+    }
+
+  /* Place non-contiguous array clauses at the start of the clause list.  */
+  if (array_clauses)
+    {
+      OMP_CLAUSE_CHAIN (array_clauses_tail) = clauses;
+      *clauses_ptr = array_clauses;
+    }
+}
+
 /* Scan a GIMPLE_OMP_TARGET.  */
 
 static void
@@ -2591,7 +2810,6 @@ scan_omp_target (gomp_target *stmt, omp_context *o
   omp_context *ctx;
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
-  tree clauses = gimple_omp_target_clauses (stmt);
 
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
@@ -2610,6 +2828,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
+  /* If is OpenACC construct, put non-contiguous array clauses (if any)
+     in front of clause chain.  The runtime can then test the first to see
+     if the additional map processing for them is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));
+
+  tree clauses = gimple_omp_target_clauses (stmt);
+
   scan_sharing_clauses (clauses, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 
@@ -11326,6 +11552,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
           case GOMP_MAP_FORCE_PRESENT:
           case GOMP_MAP_FORCE_DEVICEPTR:
           case GOMP_MAP_DEVICE_RESIDENT:
+          case GOMP_MAP_NONCONTIG_ARRAY_TO:
+          case GOMP_MAP_NONCONTIG_ARRAY_FROM:
+          case GOMP_MAP_NONCONTIG_ARRAY_TOFROM:
+          case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO:
+          case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM:
+          case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM:
+          case GOMP_MAP_NONCONTIG_ARRAY_ALLOC:
+          case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC:
+          case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT:
           case GOMP_MAP_LINK:
             gcc_assert (is_gimple_omp_oacc (stmt));
             break;
@@ -11388,7 +11623,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
         if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
                            && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
           {
-            x = build_receiver_ref (var, true, ctx);
+            tree var_type = TREE_TYPE (var);
+            bool rcv_by_ref =
+              (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+               && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+               && TREE_CODE (var_type) != ARRAY_TYPE
+               ? false : true);
+
+            x = build_receiver_ref (var, rcv_by_ref, ctx);
             tree new_var = lookup_decl (var, ctx);
 
             if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -11635,6 +11877,24 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
                     avar = build_fold_addr_expr (avar);
                     gimplify_assign (x, avar, &ilist);
                   }
+                else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+                         && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+                  {
+                    int dim_num;
+                    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+                    tree array_descr_type =
+                      create_noncontig_array_descr_type (OMP_CLAUSE_DECL (c),
+                                                         dimensions, &dim_num);
+                    tree array_descr =
+                      create_tmp_var_raw (array_descr_type, ".omp_noncontig_array_descr");
+                    gimple_add_tmp_var (array_descr);
+
+                    create_noncontig_array_descr_init_code
+                      (array_descr, ovar, dimensions, dim_num, &ilist);
+
+                    gimplify_assign (x, build_fold_addr_expr (array_descr), &ilist);
+                  }
                 else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
                   {
                     gcc_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -11695,6 +11955,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
                 s = TREE_TYPE (s);
                 s = TYPE_SIZE_UNIT (s);
               }
+            else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+                     && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+              s = NULL_TREE;
             else
               s = OMP_CLAUSE_SIZE (c);
             if (s == NULL_TREE)
@@ -12384,7 +12647,202 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p,
                        gimple_build_omp_return (false));
 }
 
+/* Helper to lookup non-contiguous arrays through nested omp contexts.  Returns
+   TREE_LIST of dimensions, and the CTX where it was found in *CTX_P.  */
+static tree
+noncontig_array_lookup (tree t, omp_context **ctx_p)
+{
+  omp_context *c = *ctx_p;
+  while (c)
+    {
+      tree *dims = c->non_contiguous_arrays->get (t);
+      if (dims)
+        {
+          *ctx_p = c;
+          return *dims;
+        }
+      c = c->outer;
+    }
+  return NULL_TREE;
+}
+
+/* Tests if this gimple STMT is the start of a non-contiguous array access
+   sequence.  Returns true if found, and also returns the gimple operand ptr
+   and dimensions tree list through *OUT_REF and *OUT_DIMS respectively.  */
+
+static bool
+noncontig_array_reference_start (gimple *stmt, omp_context **ctx_p,
+                                 tree **out_ref, tree *out_dims)
+{
+  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+    for (unsigned i = 1; i < gimple_num_ops (stmt); i++)
+      {
+        tree *op = gimple_op_ptr (stmt, i), dims;
+        if (TREE_CODE (*op) == ARRAY_REF)
+          op = &TREE_OPERAND (*op, 0);
+        if (TREE_CODE (*op) == MEM_REF)
+          op = &TREE_OPERAND (*op, 0);
+        if ((dims = noncontig_array_lookup (*op, ctx_p)) != NULL_TREE)
+          {
+            *out_ref = op;
+            *out_dims = dims;
+            return true;
+          }
+      }
+  return false;
+}
+
+static tree
+scan_for_op (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+  tree t = *tp;
+  tree op = (tree) wi->info;
+  *walk_subtrees = 1;
+  if (operand_equal_p (t, op, 0))
+    {
+      wi->info = tp;
+      return t;
+    }
+  return NULL_TREE;
+}
+
+static tree *
+scan_for_reference (gimple *stmt, tree op)
+{
+  struct walk_stmt_info wi;
+  memset (&wi, 0, sizeof (wi));
+  wi.info = op;
+  if (walk_gimple_op (stmt, scan_for_op, &wi))
+    return (tree *) wi.info;
+  return NULL;
+}
+
+static tree
+ncarray_create_bias (tree orig_bias, tree unit_type)
+{
+  return build2 (MULT_EXPR, sizetype, fold_convert (sizetype, orig_bias),
+                 TYPE_SIZE_UNIT (unit_type));
+}
+
+/* Main worker for adjusting non-contiguous array accesses, handles the
+   adjustment of many cases of statement forms, and called multiple times
+   to 'peel' away each dimension.  */
+
+static gimple_stmt_iterator
+ncarray_dimension_peel (omp_context *ctx,
+                        gimple_stmt_iterator gsi, tree orig_da,
+                        tree *op_ptr, tree *type_ptr, tree *dims_ptr)
+{
+  gimple *stmt = gsi_stmt (gsi);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (gimple_num_ops (stmt) == 2
+      && TREE_CODE (rhs) == MEM_REF
+      && operand_equal_p (*op_ptr, TREE_OPERAND (rhs, 0), 0)
+      && !operand_equal_p (orig_da, TREE_OPERAND (rhs, 0), 0)
+      && (TREE_OPERAND (rhs, 1) == NULL_TREE
+          || integer_zerop (TREE_OPERAND (rhs, 1))))
+    {
+      gcc_assert (TREE_CODE (TREE_TYPE (*type_ptr)) == POINTER_TYPE);
+      *type_ptr = TREE_TYPE (*type_ptr);
+    }
+  else
+    {
+      gimple *g;
+      gimple_seq ilist = NULL;
+      tree bias, t;
+      tree op = *op_ptr;
+      tree orig_type = *type_ptr;
+      tree orig_bias = TREE_PURPOSE (*dims_ptr);
+      bool by_ref = false;
+
+      if (TREE_CODE (orig_bias) != INTEGER_CST)
+        orig_bias = lookup_decl (orig_bias, ctx);
+
+      if (gimple_num_ops (stmt) == 2)
+        {
+          if (TREE_CODE (rhs) == ADDR_EXPR)
+            {
+              rhs = TREE_OPERAND (rhs, 0);
+              *dims_ptr = NULL_TREE;
+            }
+
+          if (TREE_CODE (rhs) == ARRAY_REF
+              && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF
+              && operand_equal_p (TREE_OPERAND (TREE_OPERAND (rhs, 0), 0),
+                                  *op_ptr, 0))
+            {
+              bias = ncarray_create_bias (orig_bias,
+                                          TREE_TYPE (TREE_TYPE (orig_type)));
+              *type_ptr = TREE_TYPE (TREE_TYPE (orig_type));
+            }
+          else if (TREE_CODE (rhs) == ARRAY_REF
+                   && TREE_CODE (TREE_OPERAND (rhs, 0)) == VAR_DECL
+                   && operand_equal_p (TREE_OPERAND (rhs, 0), *op_ptr, 0))
+            {
+              tree ptr_type = build_pointer_type (orig_type);
+              op = create_tmp_var (ptr_type);
+              gimplify_assign (op, build_fold_addr_expr (TREE_OPERAND (rhs, 0)),
+                               &ilist);
+              bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+              *type_ptr = TREE_TYPE (orig_type);
+              orig_type = ptr_type;
+              by_ref = true;
+            }
+          else if (TREE_CODE (rhs) == MEM_REF
+                   && operand_equal_p (*op_ptr, TREE_OPERAND (rhs, 0), 0)
+                   && TREE_OPERAND (rhs, 1) != NULL_TREE)
+            {
+              bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+              *type_ptr = TREE_TYPE (orig_type);
+            }
+          else if (TREE_CODE (lhs) == MEM_REF
+                   && operand_equal_p (*op_ptr, TREE_OPERAND (lhs, 0), 0))
+            {
+              if (*dims_ptr != NULL_TREE)
+                {
+                  gcc_assert (TREE_CHAIN (*dims_ptr) == NULL_TREE);
+                  bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+                  *type_ptr = TREE_TYPE (orig_type);
+                }
+              else
+                /* This should be the end of the non-contiguous array access
+                   sequence.  */
+                return gsi;
+            }
+          else
+            gcc_unreachable ();
+        }
+      else if (gimple_num_ops (stmt) == 3
+               && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+               && operand_equal_p (*op_ptr, rhs, 0))
+        {
+          bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+        }
+      else
+        gcc_unreachable ();
+
+      bias = fold_build1 (NEGATE_EXPR, sizetype, bias);
+      bias = fold_build2 (POINTER_PLUS_EXPR, orig_type, op, bias);
+
+      t = create_tmp_var (by_ref ? build_pointer_type (orig_type) : orig_type);
+
+      g = gimplify_assign (t, bias, &ilist);
+      gsi_insert_seq_before (&gsi, ilist, GSI_NEW_STMT);
+      *op_ptr = gimple_assign_lhs (g);
+
+      if (by_ref)
+        *op_ptr = build2 (MEM_REF, TREE_TYPE (orig_type), *op_ptr,
+                          build_int_cst (orig_type, 0));
+      *dims_ptr = TREE_CHAIN (*dims_ptr);
+    }
+
+  return gsi;
+}
+
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
    of OMP context, but with task_shared_vars set.  */
@@ -12709,6 +13167,48 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_cont
 
     default:
     regimplify:
+      /* If we detect the start of a non-contiguous array reference sequence,
+         scan and do the needed adjustments.  */
+      tree dims, *op_ptr;
+      omp_context *ncarray_ctx = ctx;
+      if (ncarray_ctx
+          && noncontig_array_reference_start (stmt, &ncarray_ctx, &op_ptr, &dims))
+        {
+          bool started = false;
+          tree orig_array_var = *op_ptr;
+          tree curr_type = TREE_TYPE (orig_array_var);
+
+          gimple_stmt_iterator gsi = *gsi_p, new_gsi;
+          while (op_ptr)
+            {
+              if (!is_gimple_assign (gsi_stmt (gsi))
+                  || ((gimple_assign_single_p (gsi_stmt (gsi))
+                       || gimple_assign_cast_p (gsi_stmt (gsi)))
+                      && *op_ptr == gimple_assign_rhs1 (gsi_stmt (gsi))))
+                break;
+
+              new_gsi = ncarray_dimension_peel (ncarray_ctx, gsi, orig_array_var,
+                                                op_ptr, &curr_type, &dims);
+              if (!started)
+                {
+                  /* Point 'stmt' to the start of the newly added
+                     sequence.  */
+                  started = true;
+                  *gsi_p = new_gsi;
+                  stmt = gsi_stmt (*gsi_p);
+                }
+              if (dims == NULL_TREE)
+                break;
+
+              tree next_op = gimple_assign_lhs (gsi_stmt (gsi));
+              do {
+                gsi_next (&gsi);
+                op_ptr = scan_for_reference (gsi_stmt (gsi), next_op);
+              }
+              while (!op_ptr);
+            }
+        }
+
       if ((ctx || task_shared_vars)
           && walk_gimple_op (stmt, lower_omp_regimplify_p,
                              ctx ?
NULL : &wi)) Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c (revision 274618) +++ gcc/tree-pretty-print.c (working copy) @@ -849,6 +849,33 @@ dump_omp_clause (pretty_printer *pp, tree clause, case GOMP_MAP_LINK: pp_string (pp, "link"); break; + case GOMP_MAP_NONCONTIG_ARRAY_TO: + pp_string (pp, "to,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FROM: + pp_string (pp, "from,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_TOFROM: + pp_string (pp, "tofrom,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO: + pp_string (pp, "force_to,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM: + pp_string (pp, "force_from,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM: + pp_string (pp, "force_tofrom,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_ALLOC: + pp_string (pp, "alloc,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC: + pp_string (pp, "force_alloc,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT: + pp_string (pp, "force_present,noncontig_array"); + break; default: gcc_unreachable (); } @@ -870,6 +897,10 @@ dump_omp_clause (pretty_printer *pp, tree clause, case GOMP_MAP_TO_PSET: pp_string (pp, " [pointer set, len: "); break; + case GOMP_MAP_NONCONTIG_ARRAY: + gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST); + pp_string (pp, " [dimensions: "); + break; default: pp_string (pp, " [len: "); break; Index: include/gomp-constants.h =================================================================== --- include/gomp-constants.h (revision 274618) +++ include/gomp-constants.h (working copy) @@ -40,6 +40,7 @@ #define GOMP_MAP_FLAG_SPECIAL_0 (1 << 2) #define GOMP_MAP_FLAG_SPECIAL_1 (1 << 3) #define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4) +#define GOMP_MAP_FLAG_SPECIAL_3 (1 << 5) #define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \ | 
GOMP_MAP_FLAG_SPECIAL_0) /* Flag to force a specific behavior (or else, trigger a run-time error). */ @@ -127,6 +128,26 @@ enum gomp_map_kind /* Decrement usage count and deallocate if zero. */ GOMP_MAP_RELEASE = (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_DELETE), + /* Mapping kinds for non-contiguous arrays. */ + GOMP_MAP_NONCONTIG_ARRAY = (GOMP_MAP_FLAG_SPECIAL_3), + GOMP_MAP_NONCONTIG_ARRAY_TO = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_TO), + GOMP_MAP_NONCONTIG_ARRAY_FROM = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FROM), + GOMP_MAP_NONCONTIG_ARRAY_TOFROM = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_TOFROM), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO = (GOMP_MAP_NONCONTIG_ARRAY_TO + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM = (GOMP_MAP_NONCONTIG_ARRAY_FROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM = (GOMP_MAP_NONCONTIG_ARRAY_TOFROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_ALLOC = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_ALLOC), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FORCE_ALLOC), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FORCE_PRESENT), /* Internal to GCC, not used in libgomp. */ /* Do not map, but pointer assign a pointer instead. */ @@ -155,6 +176,8 @@ enum gomp_map_kind #define GOMP_MAP_ALWAYS_P(X) \ (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM)) +#define GOMP_MAP_NONCONTIG_ARRAY_P(X) \ + ((X) & GOMP_MAP_NONCONTIG_ARRAY) /* Asynchronous behavior. Keep in sync with libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t. 
 */
From patchwork Tue Aug 20 11:36:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chung-Lin Tang
X-Patchwork-Id: 1150050
From: Chung-Lin Tang
To: gcc-patches, Jakub Jelinek, Thomas Schwinge
Subject: [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches
Message-ID: <5c0db7bd-093d-d406-eb73-b26bc7685a4d@mentor.com>
Date: Tue, 20 Aug 2019 19:36:56 +0800

These are the libgomp patches (including testcases).
Not much has changed from the last submission besides renaming to 'non-contiguous', etc., and rebasing.

Thanks,
Chung-Lin

        libgomp/
        * target.c (struct gomp_ncarray_dim): New struct declaration.
        (struct gomp_ncarray_descr_type): Likewise.
        (struct ncarray_info): Likewise.
        (gomp_noncontig_array_count_rows): New function.
        (gomp_noncontig_array_compute_info): Likewise.
        (gomp_noncontig_array_fill_rows_1): Likewise.
        (gomp_noncontig_array_fill_rows): Likewise.
        (gomp_noncontig_array_create_ptrblock): Likewise.
        (gomp_map_vars): Add code to handle non-contiguous array map kinds.
        * testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: New test.

Index: libgomp/target.c
===================================================================
--- libgomp/target.c (revision 274618)
+++ libgomp/target.c (working copy)
@@ -510,6 +510,151 @@ gomp_map_val (struct target_mem_desc *tgt, void **
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
+/* Definitions for data structures describing non-contiguous arrays
+   (Note: interfaces with compiler)
+
+   The compiler generates a descriptor for each such array, places the
+   descriptor on stack, and passes the address of the descriptor to the libgomp
+   runtime as a normal map argument. The runtime then processes the array
+   data structure setup, and replaces the argument with the new actual
+   array address for the child function.
+
+   Care must be taken such that the struct field and layout assumptions
+   of struct gomp_ncarray_dim, gomp_ncarray_descr_type inside the compiler
+   be consistent with the below declarations.
*/ + +struct gomp_ncarray_dim { + size_t base; + size_t length; + size_t elem_size; + size_t is_array; +}; + +struct gomp_ncarray_descr_type { + void *ptr; + size_t ndims; + struct gomp_ncarray_dim dims[]; +}; + +/* Internal non-contiguous array info struct, used only here inside the runtime. */ + +struct ncarray_info +{ + struct gomp_ncarray_descr_type *descr; + size_t map_index; + size_t ptrblock_size; + size_t data_row_num; + size_t data_row_size; +}; + +static size_t +gomp_noncontig_array_count_rows (struct gomp_ncarray_descr_type *descr) +{ + size_t nrows = 1; + for (size_t d = 0; d < descr->ndims - 1; d++) + nrows *= descr->dims[d].length / sizeof (void *); + return nrows; +} + +static void +gomp_noncontig_array_compute_info (struct ncarray_info *nca) +{ + size_t d, n = 1; + struct gomp_ncarray_descr_type *descr = nca->descr; + + nca->ptrblock_size = 0; + for (d = 0; d < descr->ndims - 1; d++) + { + size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size; + size_t dim_ptrblock_size = (descr->dims[d + 1].is_array + ? 0 : descr->dims[d].length * n); + nca->ptrblock_size += dim_ptrblock_size; + n *= dim_count; + } + nca->data_row_num = n; + nca->data_row_size = descr->dims[d].length; +} + +static void +gomp_noncontig_array_fill_rows_1 (struct gomp_ncarray_descr_type *descr, void *nca, + size_t d, void ***row_ptr, size_t *count) +{ + if (d < descr->ndims - 1) + { + size_t elsize = descr->dims[d].elem_size; + size_t n = descr->dims[d].length / elsize; + void *p = nca + descr->dims[d].base; + for (size_t i = 0; i < n; i++) + { + void *ptr = p + i * elsize; + /* Deref if next dimension is not array. 
*/ + if (!descr->dims[d + 1].is_array) + ptr = *((void **) ptr); + gomp_noncontig_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count); + } + } + else + { + **row_ptr = nca + descr->dims[d].base; + *row_ptr += 1; + *count += 1; + } +} + +static size_t +gomp_noncontig_array_fill_rows (struct gomp_ncarray_descr_type *descr, void *rows[]) +{ + size_t count = 0; + void **p = rows; + gomp_noncontig_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count); + return count; +} + +static void * +gomp_noncontig_array_create_ptrblock (struct ncarray_info *nca, + void *tgt_addr, void *tgt_data_rows[]) +{ + struct gomp_ncarray_descr_type *descr = nca->descr; + void *ptrblock = gomp_malloc (nca->ptrblock_size); + void **curr_dim_ptrblock = (void **) ptrblock; + size_t n = 1; + + for (size_t d = 0; d < descr->ndims - 1; d++) + { + int curr_dim_len = descr->dims[d].length; + int next_dim_len = descr->dims[d + 1].length; + int curr_dim_num = curr_dim_len / sizeof (void *); + + void *next_dim_ptrblock + = (void *)(curr_dim_ptrblock + n * curr_dim_num); + + for (int b = 0; b < n; b++) + for (int i = 0; i < curr_dim_num; i++) + { + if (d < descr->ndims - 2) + { + void *ptr = (next_dim_ptrblock + + b * curr_dim_num * next_dim_len + + i * next_dim_len); + void *tgt_ptr = tgt_addr + (ptr - ptrblock); + curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr; + } + else + { + curr_dim_ptrblock[b * curr_dim_num + i] + = tgt_data_rows[b * curr_dim_num + i]; + } + void *addr = &curr_dim_ptrblock[b * curr_dim_num + i]; + assert (ptrblock <= addr && addr < ptrblock + nca->ptrblock_size); + } + + n *= curr_dim_num; + curr_dim_ptrblock = next_dim_ptrblock; + } + assert (n == nca->data_row_num); + return ptrblock; +} + static inline __attribute__((always_inline)) struct target_mem_desc * gomp_map_vars_internal (struct gomp_device_descr *devicep, struct goacc_asyncqueue *aq, size_t mapnum, @@ -523,9 +668,37 @@ gomp_map_vars_internal (struct gomp_device_descr * const int typemask = short_mapkind ? 
0xff : 0x7; struct splay_tree_s *mem_map = &devicep->mem_map; struct splay_tree_key_s cur_node; - struct target_mem_desc *tgt - = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); - tgt->list_count = mapnum; + struct target_mem_desc *tgt; + + bool process_noncontig_arrays = false; + size_t nca_data_row_num = 0, row_start = 0; + size_t nca_info_num = 0, nca_index; + struct ncarray_info *nca_info = NULL; + struct target_var_desc *row_desc; + uintptr_t target_row_addr; + void **host_data_rows = NULL, **target_data_rows = NULL; + void *row; + + if (mapnum > 0) + { + int kind = get_kind (short_mapkind, kinds, 0); + process_noncontig_arrays = GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask); + } + + if (process_noncontig_arrays) + for (i = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + nca_data_row_num += gomp_noncontig_array_count_rows (hostaddrs[i]); + nca_info_num += 1; + } + } + + tgt = gomp_malloc (sizeof (*tgt) + + sizeof (tgt->list[0]) * (mapnum + nca_data_row_num)); + tgt->list_count = mapnum + nca_data_row_num; tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 
0 : 1; tgt->device_descr = devicep; struct gomp_coalesce_buf cbuf, *cbufp = NULL; @@ -537,6 +710,14 @@ gomp_map_vars_internal (struct gomp_device_descr * return tgt; } + if (nca_info_num) + nca_info = gomp_alloca (sizeof (struct ncarray_info) * nca_info_num); + if (nca_data_row_num) + { + host_data_rows = gomp_malloc (sizeof (void *) * nca_data_row_num); + target_data_rows = gomp_malloc (sizeof (void *) * nca_data_row_num); + } + tgt_align = sizeof (void *); tgt_size = 0; cbuf.chunks = NULL; @@ -568,7 +749,7 @@ gomp_map_vars_internal (struct gomp_device_descr * return NULL; } - for (i = 0; i < mapnum; i++) + for (i = 0, nca_index = 0; i < mapnum; i++) { int kind = get_kind (short_mapkind, kinds, i); if (hostaddrs[i] == NULL @@ -633,6 +814,20 @@ gomp_map_vars_internal (struct gomp_device_descr * has_firstprivate = true; continue; } + else if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + /* Ignore non-contiguous arrays for now, we process them together + later. */ + tgt->list[i].key = NULL; + tgt->list[i].offset = 0; + not_found_cnt++; + + struct ncarray_info *nca = &nca_info[nca_index++]; + nca->descr = (struct gomp_ncarray_descr_type *) hostaddrs[i]; + nca->map_index = i; + continue; + } + cur_node.host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) cur_node.host_end = cur_node.host_start + sizes[i]; @@ -701,6 +896,56 @@ gomp_map_vars_internal (struct gomp_device_descr * } } + /* For non-contiguous arrays. Each data row is one target item, separated + from the normal map clause items, hence we order them after mapnum. 
*/ + if (process_noncontig_arrays) + for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + continue; + + struct ncarray_info *nca = &nca_info[nca_index++]; + struct gomp_ncarray_descr_type *descr = nca->descr; + size_t nr; + + gomp_noncontig_array_compute_info (nca); + + /* We have allocated space in host/target_data_rows to place all the + row data block pointers, now we can start filling them in. */ + nr = gomp_noncontig_array_fill_rows (descr, &host_data_rows[row_start]); + assert (nr == nca->data_row_num); + + size_t align = (size_t) 1 << (kind >> rshift); + if (tgt_align < align) + tgt_align = align; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += nca->ptrblock_size; + + for (size_t j = 0; j < nca->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + nca->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc, + kind & typemask, /* TODO: cbuf? */ NULL); + } + else + { + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += nca->data_row_size; + not_found_cnt++; + } + } + row_start += nca->data_row_num; + } + if (devaddrs) { if (mapnum != 1) @@ -861,6 +1106,15 @@ gomp_map_vars_internal (struct gomp_device_descr * default: break; } + + if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + tgt->list[i].key = &array->key; + tgt->list[i].key->tgt = tgt; + array++; + continue; + } + splay_tree_key k = &array->key; k->host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) @@ -1010,8 +1264,115 @@ gomp_map_vars_internal (struct gomp_device_descr * array++; } } + + /* Processing of non-contiguous array rows. 
 */ + if (process_noncontig_arrays) + { + for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + continue; + + struct ncarray_info *nca = &nca_info[nca_index++]; + assert (nca->descr == hostaddrs[i]); + + /* The map for the non-contiguous array itself is never copied from + during unmapping, it's the data rows that count. Set copy-from + flags to false here. */ + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + + size_t align = (size_t) 1 << (kind >> rshift); + tgt_size = (tgt_size + align - 1) & ~(align - 1); + + /* For the map of the non-contiguous array itself, adjust so that + the passed device address points to the beginning of the + ptrblock. */ + tgt->list[i].key->tgt_offset = tgt_size; + + void *target_ptrblock = (void*) tgt->tgt_start + tgt_size; + tgt_size += nca->ptrblock_size; + + /* Add splay key for each data row in current non-contiguous + array.
*/ + for (size_t j = 0; j < nca->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + nca->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc, + kind & typemask, cbufp); + target_row_addr = n->tgt->tgt_start + n->tgt_offset; + } + else + { + tgt->refcount++; + + splay_tree_key k = &array->key; + k->host_start = (uintptr_t) row; + k->host_end = k->host_start + nca->data_row_size; + + k->tgt = tgt; + k->refcount = 1; + k->link_key = NULL; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + target_row_addr = tgt->tgt_start + tgt_size; + k->tgt_offset = tgt_size; + tgt_size += nca->data_row_size; + + row_desc->key = k; + row_desc->copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->always_copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->offset = 0; + row_desc->length = nca->data_row_size; + + array->left = NULL; + array->right = NULL; + splay_tree_insert (mem_map, array); + + if (GOMP_MAP_COPY_TO_P (kind & typemask)) + gomp_copy_host2dev (devicep, aq, + (void *) tgt->tgt_start + k->tgt_offset, + (void *) k->host_start, + nca->data_row_size, cbufp); + array++; + } + target_data_rows[row_start + j] = (void *) target_row_addr; + } + + /* Now we have the target memory allocated, and target offsets of all + row blocks assigned and calculated, we can construct the + accelerator side ptrblock and copy it in. 
 */ + if (nca->ptrblock_size) + { + void *ptrblock = gomp_noncontig_array_create_ptrblock + (nca, target_ptrblock, target_data_rows + row_start); + gomp_copy_host2dev (devicep, aq, target_ptrblock, ptrblock, + nca->ptrblock_size, cbufp); + free (ptrblock); + } + + row_start += nca->data_row_num; + } + assert (row_start == nca_data_row_num && nca_index == nca_info_num); + } } + if (nca_data_row_num) + { + free (host_data_rows); + free (target_data_rows); + } + if (pragma_kind == GOMP_MAP_VARS_TARGET) { for (i = 0; i < mapnum; i++) Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c (working copy) @@ -0,0 +1,103 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include <stdlib.h> +#include <assert.h> + +#define n 100 +#define m 100 + +int b[n][m]; + +void +test1 (void) +{ + int i, j, *a[100]; + + /* Array of pointers form test. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up. */ + free (a[i]); + } +} + +void +test2 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + + /* Separately allocated blocks. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up.
*/ + free (a[i]); + } + free (a); +} + +void +test3 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + a[0] = (int *) malloc (sizeof (int) * n * m); + + /* Rows allocated in one contiguous block. */ + for (i = 0; i < n; i++) + { + a[i] = *a + i * m; + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + + free (a[0]); + free (a); +} + +int +main (void) +{ + test1 (); + test2 (); + test3 (); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c (working copy) @@ -0,0 +1,37 @@ +/* { dg-do run { target { ! 
openacc_host_selected } } } */ + +#include <assert.h> +#include "noncontig_array-utils.h" + +int +main (void) +{ + int n = 10; + int ***a = (int ***) create_ncarray (sizeof (int), n, 3); + int ***b = (int ***) create_ncarray (sizeof (int), n, 3); + int ***c = (int ***) create_ncarray (sizeof (int), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + { + a[i][j][k] = i + j * k + k; + b[i][j][k] = j + k * i + i * j; + c[i][j][k] = a[i][j][k]; + } + + #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n]) + { + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] += b[k][j][i] + i + j + k; + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k); + + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c (working copy) @@ -0,0 +1,45 @@ +/* { dg-do run { target { !
openacc_host_selected } } } */ + +#include <assert.h> +#include "noncontig_array-utils.h" + +int main (void) +{ + int n = 20, x = 5, y = 12; + int *****a = (int *****) create_ncarray (sizeof (int), n, 5); + + int sum1 = 0, sum2 = 0, sum3 = 0; + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + { + a[i][j][k][l][m] = 1; + sum1++; + } + + #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2) + { + for (int i = x; i < x + y; i++) + for (int j = x; j < x + y; j++) + for (int k = x; k < x + y; k++) + for (int l = x; l < x + y; l++) + for (int m = x; m < x + y; m++) + { + a[i][j][k][l][m] = 0; + sum2++; + } + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + sum3 += a[i][j][k][l][m]; + + assert (sum1 == sum2 + sum3); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c (working copy) @@ -0,0 +1,36 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include <assert.h> +#include "noncontig_array-utils.h" + +int main (void) +{ + int n = 128; + double ***a = (double ***) create_ncarray (sizeof (double), n, 3); + double ***b = (double ***) create_ncarray (sizeof (double), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] = i + j + k + i * j * k; + + /* This test exercises async copyout of non-contiguous array rows.
 */ + #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5) + { + #pragma acc loop gang + for (int i = 0; i < n; i++) + #pragma acc loop vector + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + b[i][j][k] = a[i][j][k] * 2.0; + } + + #pragma acc wait (5) + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (b[i][j][k] == a[i][j][k] * 2.0); + + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h (working copy) @@ -0,0 +1,44 @@ +#include <stdlib.h> +#include <string.h> +#include <assert.h> +#include <stdint.h> + +/* Allocate and create a pointer based NDIMS-dimensional array, + each dimension DIMLEN long, with ELSIZE sized data elements. */ +void * +create_ncarray (size_t elsize, int dimlen, int ndims) +{ + size_t blk_size = 0; + size_t n = 1; + + for (int i = 0; i < ndims - 1; i++) + { + n *= dimlen; + blk_size += sizeof (void *) * n; + } + size_t data_rows_num = n; + size_t data_rows_offset = blk_size; + blk_size += elsize * n * dimlen; + + void *blk = (void *) malloc (blk_size); + memset (blk, 0, blk_size); + void **curr_dim = (void **) blk; + n = 1; + + for (int d = 0; d < ndims - 1; d++) + { + uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen); + size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize); + + for (int b = 0; b < n; b++) + for (int i = 0; i < dimlen; i++) + if (d < ndims - 1) + curr_dim[b * dimlen + i] + = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen); + + n *= dimlen; + curr_dim = (void**) next_dim; + } + assert (n == data_rows_num); + return blk; +}
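
For readers following the descriptor layout in libgomp/target.c, the sizing arithmetic of gomp_noncontig_array_compute_info can be tried out in isolation. The sketch below is not part of the patch. The names nc_dim, nc_descr2, nc_info, nc_compute_info and nc_info_2d are made up for illustration, the descriptor is fixed at two dimensions so it can live on the stack, and it assumes (as the runtime code suggests, but the patch does not state outright) that 'base' and 'length' are byte counts.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the patch's descriptor types (hypothetical
   names).  Assumption: base and length are expressed in bytes.  */
struct nc_dim { size_t base, length, elem_size, is_array; };
struct nc_descr2 { void *ptr; size_t ndims; struct nc_dim dims[2]; };
struct nc_info { size_t ptrblock_size, data_row_num, data_row_size; };

/* Mirrors the arithmetic of gomp_noncontig_array_compute_info.  */
static struct nc_info
nc_compute_info (const struct nc_descr2 *descr)
{
  struct nc_info info = { 0, 1, 0 };
  size_t d;

  assert (descr->ndims >= 1 && descr->ndims <= 2);
  for (d = 0; d + 1 < descr->ndims; d++)
    {
      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
      /* As read from the patch: a level only contributes to the pointer
         block when the next dimension is not flagged is_array.  */
      if (!descr->dims[d + 1].is_array)
        info.ptrblock_size += descr->dims[d].length * info.data_row_num;
      info.data_row_num *= dim_count;
    }
  /* The innermost dimension gives the size of one data row.  */
  info.data_row_size = descr->dims[d].length;
  return info;
}

/* Describe 'T **a' mapped as a[0:n][0:m], where sizeof (T) == elsize.  */
static struct nc_info
nc_info_2d (size_t n, size_t m, size_t elsize)
{
  struct nc_descr2 descr = {
    NULL, 2,
    { { 0, n * sizeof (void *), sizeof (void *), 0 },
      { 0, m * elsize, elsize, 0 } }
  };
  return nc_compute_info (&descr);
}
```

Under these assumptions, an 'int **a' mapped as a[0:100][0:100] (the shape used in noncontig_array-1.c) comes out as 100 data rows of 100 * sizeof (int) bytes each, plus a pointer block of 100 * sizeof (void *) bytes.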