From patchwork Tue Oct 16 12:57:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chung-Lin Tang X-Patchwork-Id: 984749 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-487645-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=mentor.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="KRLEw91k"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42ZFhG6JHgz9s9G for ; Tue, 16 Oct 2018 23:57:18 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :reply-to:from:subject:to:message-id:date:mime-version :content-type; q=dns; s=default; b=o5VBMFSi2mrD6AHw/Pc388x5ZXeT5 ZMCtr3Q7SQRvpXp+vgcw7uOw8ou8djdD7gre1X7VT6GKlN1t9ChJPwZxNkajt8Wh u6NSBA570qErawgo3/48uJuznLOqByi6jErY1Je2RIaDu/uFqW89BIMHh8fw6cC9 7zrX7Z4GLDpNis= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :reply-to:from:subject:to:message-id:date:mime-version :content-type; s=default; bh=Qx66ls5+xZ+20GZ3Ezma+kTSUHY=; b=KRL Ew91kpTftk/q2S3sAuYJ1YIBefCeojVTuYb46xyOVuzFqVP2LyjWKbCy9VDPXzJd uVaBYIFV46wxaybYVa3brfOelODrpaU5bTkFwDPFdYb2pO+NjAuG5be0Xobrs7Lv 0Qz2QwTBP5D0pBq0zjXBD/0HcmLpm41v2a5kceLw= Received: (qmail 79880 invoked by alias); 16 Oct 2018 12:57:11 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 79852 invoked by uid 89); 16 Oct 2018 12:57:10 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.6 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS, UNWANTED_LANGUAGE_BODY autolearn=ham version=3.3.2 spammy=tgt, Processing, row, filling X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 16 Oct 2018 12:57:08 +0000 Received: from svr-orw-mbx-02.mgc.mentorg.com ([147.34.90.202]) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gCOuJ-0007Lv-3J from ChungLin_Tang@mentor.com ; Tue, 16 Oct 2018 05:57:07 -0700 Received: from [0.0.0.0] (147.34.91.1) by svr-orw-mbx-02.mgc.mentorg.com (147.34.90.202) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Tue, 16 Oct 2018 05:57:04 -0700 Reply-To: From: Chung-Lin Tang Subject: [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support To: , Jakub Jelinek , Thomas Schwinge Message-ID: Date: Tue, 16 Oct 2018 20:57:00 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 This part is the libgomp runtime handling for OpenACC dynamic arrays. We handle such arrays by creating a "pointer block" that emulates the N-1 dimensions, and then treating each data row of the final Nth dimension as an individual object mapped in the TGT. All the rows are processed as appended after all the other map kind objects. Thanks, Chung-Lin libgomp/ * target.c (struct da_dim): New struct declaration. (struct da_descr_type): Likewise. (struct da_info): Likewise. (gomp_dynamic_array_count_rows): New function. (gomp_dynamic_array_compute_info): Likewise. (gomp_dynamic_array_fill_rows_1): Likewise. (gomp_dynamic_array_fill_rows): Likewise. (gomp_dynamic_array_create_ptrblock): Likewise. (gomp_map_vars): Add code to handle dynamic array map kinds. diff --git a/libgomp/target.c b/libgomp/target.c index 4c9fae0..071dc70 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -490,6 +490,140 @@ gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i) return tgt->tgt_start + tgt->list[i].offset; } +/* Dynamic array related data structures, interfaces with the compiler. */ + +struct da_dim { + size_t base; + size_t length; + size_t elem_size; + size_t is_array; +}; + +struct da_descr_type { + void *ptr; + size_t ndims; + struct da_dim dims[]; +}; + +/* Internal dynamic array info struct, used only here inside the runtime. */ + +struct da_info +{ + struct da_descr_type *descr; + size_t map_index; + size_t ptrblock_size; + size_t data_row_num; + size_t data_row_size; +}; + +static size_t +gomp_dynamic_array_count_rows (struct da_descr_type *descr) +{ + size_t nrows = 1; + for (size_t d = 0; d < descr->ndims - 1; d++) + nrows *= descr->dims[d].length / sizeof (void *); + return nrows; +} + +static void +gomp_dynamic_array_compute_info (struct da_info *da) +{ + size_t d, n = 1; + struct da_descr_type *descr = da->descr; + + da->ptrblock_size = 0; + for (d = 0; d < descr->ndims - 1; d++) + { + size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size; + size_t dim_ptrblock_size = (descr->dims[d + 1].is_array + ? 0 : descr->dims[d].length * n); + da->ptrblock_size += dim_ptrblock_size; + n *= dim_count; + } + da->data_row_num = n; + da->data_row_size = descr->dims[d].length; +} + +static void +gomp_dynamic_array_fill_rows_1 (struct da_descr_type *descr, void *da, + size_t d, void ***row_ptr, size_t *count) +{ + if (d < descr->ndims - 1) + { + size_t elsize = descr->dims[d].elem_size; + size_t n = descr->dims[d].length / elsize; + void *p = da + descr->dims[d].base; + for (size_t i = 0; i < n; i++) + { + void *ptr = p + i * elsize; + /* Deref if next dimension is not array. */ + if (!descr->dims[d + 1].is_array) + ptr = *((void **) ptr); + gomp_dynamic_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count); + } + } + else + { + **row_ptr = da + descr->dims[d].base; + *row_ptr += 1; + *count += 1; + } +} + +static size_t +gomp_dynamic_array_fill_rows (struct da_descr_type *descr, void *rows[]) +{ + size_t count = 0; + void **p = rows; + gomp_dynamic_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count); + return count; +} + +static void * +gomp_dynamic_array_create_ptrblock (struct da_info *da, + void *tgt_addr, void *tgt_data_rows[]) +{ + struct da_descr_type *descr = da->descr; + void *ptrblock = gomp_malloc (da->ptrblock_size); + void **curr_dim_ptrblock = (void **) ptrblock; + size_t n = 1; + + for (size_t d = 0; d < descr->ndims - 1; d++) + { + int curr_dim_len = descr->dims[d].length; + int next_dim_len = descr->dims[d + 1].length; + int curr_dim_num = curr_dim_len / sizeof (void *); + + void *next_dim_ptrblock + = (void *)(curr_dim_ptrblock + n * curr_dim_num); + + for (int b = 0; b < n; b++) + for (int i = 0; i < curr_dim_num; i++) + { + if (d < descr->ndims - 2) + { + void *ptr = (next_dim_ptrblock + + b * curr_dim_num * next_dim_len + + i * next_dim_len); + void *tgt_ptr = tgt_addr + (ptr - ptrblock); + curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr; + } + else + { + curr_dim_ptrblock[b * curr_dim_num + i] + = tgt_data_rows[b * curr_dim_num + i]; + } + void *addr = &curr_dim_ptrblock[b * curr_dim_num + i]; + assert (ptrblock <= addr && addr < ptrblock + da->ptrblock_size); + } + + n *= curr_dim_num; + curr_dim_ptrblock = next_dim_ptrblock; + } + assert (n == da->data_row_num); + return ptrblock; +} + attribute_hidden struct target_mem_desc * gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds, @@ -501,9 +635,29 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, const int typemask = short_mapkind ? 0xff : 0x7; struct splay_tree_s *mem_map = &devicep->mem_map; struct splay_tree_key_s cur_node; - struct target_mem_desc *tgt - = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); - tgt->list_count = mapnum; + struct target_mem_desc *tgt; + + size_t da_data_row_num = 0, row_start = 0; + size_t da_info_num = 0, da_index; + struct da_info *da_info = NULL; + struct target_var_desc *row_desc; + uintptr_t target_row_addr; + void **host_data_rows = NULL, **target_data_rows = NULL; + void *row; + + for (i = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]); + da_info_num += 1; + } + } + + tgt = gomp_malloc (sizeof (*tgt) + + sizeof (tgt->list[0]) * (mapnum + da_data_row_num)); + tgt->list_count = mapnum + da_data_row_num; tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1; tgt->device_descr = devicep; struct gomp_coalesce_buf cbuf, *cbufp = NULL; @@ -515,6 +669,14 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, return tgt; } + if (da_info_num) + da_info = gomp_alloca (sizeof (struct da_info) * da_info_num); + if (da_data_row_num) + { + host_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num); + target_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num); + } + tgt_align = sizeof (void *); tgt_size = 0; cbuf.chunks = NULL; @@ -546,7 +708,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, return NULL; } - for (i = 0; i < mapnum; i++) + for (i = 0, da_index = 0; i < mapnum; i++) { int kind = get_kind (short_mapkind, kinds, i); if (hostaddrs[i] == NULL @@ -619,6 +781,20 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, has_firstprivate = true; continue; } + else if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + /* Ignore dynamic arrays for now, we process them together + later. */ + tgt->list[i].key = NULL; + tgt->list[i].offset = 0; + not_found_cnt++; + + struct da_info *da = &da_info[da_index++]; + da->descr = (struct da_descr_type *) hostaddrs[i]; + da->map_index = i; + continue; + } + cur_node.host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) cur_node.host_end = cur_node.host_start + sizes[i]; @@ -687,6 +863,55 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, } } + /* For dynamic arrays. Each data row is one target item, separated from + the normal map clause items, hence we order them after mapnum. */ + for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + continue; + + struct da_info *da = &da_info[da_index++]; + struct da_descr_type *descr = da->descr; + size_t nr; + + gomp_dynamic_array_compute_info (da); + + /* We have allocated space in host/target_data_rows to place all the + row data block pointers, now we can start filling them in. */ + nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]); + assert (nr == da->data_row_num); + + size_t align = (size_t) 1 << (kind >> rshift); + if (tgt_align < align) + tgt_align = align; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += da->ptrblock_size; + + for (size_t j = 0; j < da->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + da->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, n, &cur_node, row_desc, + kind & typemask, /* TODO: cbuf? */ NULL); + } + else + { + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += da->data_row_size; + not_found_cnt++; + } + } + row_start += da->data_row_num; + } + if (devaddrs) { if (mapnum != 1) @@ -830,6 +1055,15 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, default: break; } + + if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + { + tgt->list[i].key = &array->key; + tgt->list[i].key->tgt = tgt; + array++; + continue; + } + splay_tree_key k = &array->key; k->host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) @@ -976,6 +1210,108 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, array++; } } + + /* Processing of dynamic array rows. */ + for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask)) + continue; + + struct da_info *da = &da_info[da_index++]; + assert (da->descr == hostaddrs[i]); + + /* The map for the dynamic array itself is never copied from during + unmapping, its the data rows that count. Set copy from flags are + set to false here. */ + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + + size_t align = (size_t) 1 << (kind >> rshift); + tgt_size = (tgt_size + align - 1) & ~(align - 1); + + /* For the map of the dynamic array itself, adjust so that the passed + device address points to the beginning of the ptrblock. */ + tgt->list[i].key->tgt_offset = tgt_size; + + void *target_ptrblock = (void*) tgt->tgt_start + tgt_size; + tgt_size += da->ptrblock_size; + + /* Add splay key for each data row in current DA. */ + for (size_t j = 0; j < da->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + da->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, n, &cur_node, row_desc, + kind & typemask, cbufp); + target_row_addr = n->tgt->tgt_start + n->tgt_offset; + } + else + { + tgt->refcount++; + + splay_tree_key k = &array->key; + k->host_start = (uintptr_t) row; + k->host_end = k->host_start + da->data_row_size; + + k->tgt = tgt; + k->refcount = 1; + k->link_key = NULL; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + target_row_addr = tgt->tgt_start + tgt_size; + k->tgt_offset = tgt_size; + tgt_size += da->data_row_size; + + row_desc->key = k; + row_desc->copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->always_copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->offset = 0; + row_desc->length = da->data_row_size; + + array->left = NULL; + array->right = NULL; + splay_tree_insert (mem_map, array); + + if (GOMP_MAP_COPY_TO_P (kind & typemask)) + gomp_copy_host2dev (devicep, + (void *) tgt->tgt_start + k->tgt_offset, + (void *) k->host_start, + da->data_row_size, cbufp); + array++; + } + target_data_rows[row_start + j] = (void *) target_row_addr; + } + + /* Now we have the target memory allocated, and target offsets of all + row blocks assigned and calculated, we can construct the + accelerator side ptrblock and copy it in. */ + if (da->ptrblock_size) + { + void *ptrblock = gomp_dynamic_array_create_ptrblock + (da, target_ptrblock, target_data_rows + row_start); + gomp_copy_host2dev (devicep, target_ptrblock, ptrblock, + da->ptrblock_size, cbufp); + free (ptrblock); + } + + row_start += da->data_row_num; + } + assert (row_start == da_data_row_num && da_index == da_info_num); + } + + if (da_data_row_num) + { + free (host_data_rows); + free (target_data_rows); } if (pragma_kind == GOMP_MAP_VARS_TARGET)