From patchwork Wed May 31 19:03:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Nicolas Koenig X-Patchwork-Id: 769320 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3wdKd04sgHz9ryr for ; Thu, 1 Jun 2017 05:03:31 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="G90mYb05"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:cc:references:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=uxTyn8J+jaFHizJJm TiNf746OnwAPyh9wuBzZxYaVN4cdQxUeoqlltzwNcWGLRtNuBhcZ5fYvobnoOL1r W+Gp/Yk944A3xbiSvP668kv3pHW810mJ3JumvHLrrS5APgXs796t/vheCSsxO/Fh Iw0Je1xBoRvklMBlZiX+iFX+Hs= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:cc:references:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=nSyS+Hk14IF3oriiOVqw5p0 E9gI=; b=G90mYb05qxgP+9kL1F0hqbCXoZlbQgKQ7ZXuorTkysVeMIf7aEd6vdr hgEmMslKK0MtZYS0axORiYHwAZyoOu3CziQ9ZKUTxfAABCYBuyGcEa39b/G2/Re/ 90LcKzFt5easTxFvKX+UFwCBLf/xytjn6MVqyen5TnzN8vZi/oUY= Received: (qmail 56978 invoked by alias); 31 May 2017 19:03:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 56955 invoked by uid 89); 31 May 2017 19:03:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-8.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, LIKELY_SPAM_FROM, RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=HTo:D*fr, H*Ad:D*ch X-Spam-User: qpsmtpd, 2 recipients X-HELO: edge10.ethz.ch Received: from edge10.ethz.ch (HELO edge10.ethz.ch) (82.130.75.186) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 31 May 2017 19:03:06 +0000 Received: from CAS20.d.ethz.ch (172.31.51.110) by edge10.ethz.ch (82.130.75.186) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 31 May 2017 21:03:02 +0200 Received: from [192.168.0.227] (77.58.177.254) by mail.ethz.ch (172.31.51.110) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 31 May 2017 21:03:07 +0200 Subject: Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements To: =?UTF-8?Q?Dominique_d'Humi=c3=a8res?= , "Bernhard Reutner-Fischer" CC: gfortran , gcc-patches References: <92F638BD-768E-4C83-9BA9-0FACFC7F6C07@lps.ens.fr> <53722520-52ae-0fa7-d732-f30995c9f93a@student.ethz.ch> <638D5570-F9B9-4522-A123-CDC0B95D4227@lps.ens.fr> From: Nicolas Koenig Message-ID: <84e2e0b8-26d2-bf0c-35ae-dd8f63a111fe@student.ethz.ch> Date: Wed, 31 May 2017 21:03:06 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: <638D5570-F9B9-4522-A123-CDC0B95D4227@lps.ens.fr> Hello Dominique, attached is the next try, this time without stupidities (I hope). Both test cases you posted don't ICE anymore. Ok for trunk? Nicolas Regression tested for x86_64-pc-linux-gnu. Changelog (still the same): 2017-05-27 Nicolas Koenig PR fortran/35339 * frontend-passes.c (traverse_io_block): New function. (simplify_io_impl_do): New function. (optimize_namespace): Invoke gfc_code_walker with simplify_io_impl_do. 2017-05-27 Nicolas Koenig PR fortran/35339 * gfortran.dg/implied_do_io_1.f90: New Test. On 05/31/2017 05:49 PM, Dominique d'Humières wrote: >> Le 31 mai 2017 à 17:40, Dominique d'Humières a écrit : >> >> If I am not mistaken, compiling the following code with the patch applied > simpler test > > print *,(huge(0),i=1,6) > ! print*,(i,i=1,6) > ! print*,(i,i=10000,60000,10000) > end > >> gives an ICE. >> >> TIA >> >> Dominique Index: frontend-passes.c =================================================================== --- frontend-passes.c (revision 248539) +++ frontend-passes.c (working copy) @@ -1060,6 +1060,257 @@ convert_elseif (gfc_code **c, int *walk_subtrees A return 0; } +struct do_stack +{ + struct do_stack *prev; + gfc_iterator *iter; + gfc_code *code; +} *stack_top; + +/* Recursivly traverse the block of a WRITE or READ statement, and, can it be + optimized, do so. It optimizes it by replacing do loops with their analog + array slices. For example: + + write (*,*) (a(i), i=1,4) + + is replaced with + + write (*,*) a(1:4:1) . */ + +static bool +traverse_io_block(gfc_code *code, bool *has_reached, gfc_code *prev) +{ + gfc_code *curr; + gfc_expr *new_e, *expr, *start; + gfc_ref *ref; + struct do_stack ds_push; + int i, future_rank = 0; + gfc_iterator *iters[GFC_MAX_DIMENSIONS]; + + /* Find the first transfer/do statement. */ + for (curr = code; curr; curr = curr->next) + { + if (curr->op == EXEC_DO || curr->op == EXEC_TRANSFER) + break; + } + + /* Ensure it is the only transfer/do statement because cases like + + write (*,*) (a(i), b(i), i=1,4) + + cannot be optimized. */ + + if (!curr || curr->next) + return false; + + if (curr->op == EXEC_DO) + { + if (curr->ext.iterator->var->ref) + return false; + ds_push.prev = stack_top; + ds_push.iter = curr->ext.iterator; + ds_push.code = curr; + stack_top = &ds_push; + if (traverse_io_block(curr->block->next, has_reached, prev)) + { + if (curr != stack_top->code && !*has_reached) + { + curr->block->next = NULL; + gfc_free_statements(curr); + } + else + *has_reached = true; + return true; + } + return false; + } + + gcc_assert(curr->op == EXEC_TRANSFER); + + ref = curr->expr1->ref; + if (!ref || ref->type != REF_ARRAY || ref->u.ar.codimen != 0 || ref->next) + return false; + + /* Find the iterators belonging to each variable and check conditions. */ + for (i = 0; i < ref->u.ar.dimen; i++) + { + if (!ref->u.ar.start[i] || ref->u.ar.start[i]->ref + || ref->u.ar.dimen_type[i] != DIMEN_ELEMENT) + return false; + + start = ref->u.ar.start[i]; + gfc_simplify_expr(start, 0); + switch (start->expr_type) + { + case EXPR_VARIABLE: + + /* write (*,*) (a(i), i=a%b,1) not handled yet. */ + if (start->ref) + return false; + + /* Check for (a(k), i=1,4) or ((a(j, i), i=1,4), j=1,4). */ + if (!stack_top || !stack_top->iter + || stack_top->iter->var->symtree != start->symtree) + iters[i] = NULL; + else + { + iters[i] = stack_top->iter; + stack_top = stack_top->prev; + future_rank++; + } + break; + case EXPR_CONSTANT: + iters[i] = NULL; + break; + case EXPR_OP: + switch (start->value.op.op) + { + case INTRINSIC_PLUS: + case INTRINSIC_TIMES: + if (start->value.op.op1->expr_type != EXPR_VARIABLE) + std::swap(start->value.op.op1, start->value.op.op2); + gcc_fallthrough(); + case INTRINSIC_MINUS: + if ((start->value.op.op1->expr_type!= EXPR_VARIABLE + && start->value.op.op2->expr_type != EXPR_CONSTANT) + || start->value.op.op1->ref) + return false; + if (!stack_top || !stack_top->iter + || stack_top->iter->var->symtree + != start->value.op.op1->symtree) + return false; + iters[i] = stack_top->iter; + stack_top = stack_top->prev; + break; + default: + return false; + } + future_rank++; + break; + default: + return false; + } + } + + /* Create new expr. */ + new_e = gfc_copy_expr(curr->expr1); + new_e->expr_type = EXPR_VARIABLE; + new_e->rank = future_rank; + if (curr->expr1->shape) + { + new_e->shape = gfc_get_shape(new_e->rank); + } + + + /* Assign new starts, ends and strides if necessary. */ + for (i = 0; i < ref->u.ar.dimen; i++) + { + if (!iters[i]) + continue; + start = ref->u.ar.start[i]; + switch (start->expr_type) + { + case EXPR_CONSTANT: + gfc_internal_error("bad expression"); + break; + case EXPR_VARIABLE: + new_e->ref->u.ar.dimen_type[i] = DIMEN_RANGE; + new_e->ref->u.ar.type = AR_SECTION; + gfc_free_expr(new_e->ref->u.ar.start[i]); + new_e->ref->u.ar.start[i] = gfc_copy_expr(iters[i]->start); + new_e->ref->u.ar.end[i] = gfc_copy_expr(iters[i]->end); + new_e->ref->u.ar.stride[i] = gfc_copy_expr(iters[i]->step); + break; + case EXPR_OP: + new_e->ref->u.ar.dimen_type[i] = DIMEN_RANGE; + new_e->ref->u.ar.type = AR_SECTION; + gfc_free_expr(new_e->ref->u.ar.start[i]); + expr = gfc_copy_expr(start); + expr->value.op.op1 = gfc_copy_expr(iters[i]->start); + new_e->ref->u.ar.start[i] = expr; + gfc_simplify_expr(new_e->ref->u.ar.start[i], 0); + expr = gfc_copy_expr(start); + expr->value.op.op1 = gfc_copy_expr(iters[i]->end); + new_e->ref->u.ar.end[i] = expr; + gfc_simplify_expr(new_e->ref->u.ar.end[i], 0); + switch(start->value.op.op) + { + case INTRINSIC_MINUS: + case INTRINSIC_PLUS: + new_e->ref->u.ar.stride[i] = gfc_copy_expr(iters[i]->step); + break; + case INTRINSIC_TIMES: + expr = gfc_copy_expr(start); + expr->value.op.op1 = gfc_copy_expr(iters[i]->step); + new_e->ref->u.ar.stride[i] = expr; + gfc_simplify_expr(new_e->ref->u.ar.stride[i], 0); + break; + default: + gfc_internal_error("bad op"); + } + break; + default: + gfc_internal_error("bad expression"); + } + } + curr->expr1 = new_e; + + /* Insert modified statement. Check whether the statement needs to be + inserted at the lowest level. */ + if (!stack_top->iter) + { + if (prev) + { + curr->next = prev->next->next; + prev->next = curr; + } + else + { + curr->next = stack_top->code->block->next->next->next; + stack_top->code->block->next = curr; + } + } + else + stack_top->code->block->next = curr; + return true; +} + +/* Function for the gfc_code_walker. If code is a READ or WRITE statement, it + tries to optimize its block. */ + +static int +simplify_io_impl_do (gfc_code **code, int *walk_subtrees, + void *data ATTRIBUTE_UNUSED) +{ + gfc_code **curr, *prev = NULL; + struct do_stack write, first; + bool b = false; + *walk_subtrees = 1; + if (!(*code)->block || ((*code)->block->op != EXEC_WRITE + && (*code)->block->op != EXEC_READ)) + return 0; + + *walk_subtrees = 0; + write.prev = NULL; + write.iter = NULL; + write.code = *code; + + for (curr = &(*code)->block; *curr; curr = &(*curr)->next) + { + if ((*curr)->op == EXEC_DO) + { + first.prev = &write; + first.iter = (*curr)->ext.iterator; + first.code = *curr; + stack_top = &first; + traverse_io_block((*curr)->block->next, &b, prev); + stack_top = NULL; + } + prev = *curr; + } + return 0; +} + /* Optimize a namespace, including all contained namespaces. */ static void @@ -1073,6 +1324,7 @@ optimize_namespace (gfc_namespace *ns) in_assoc_list = false; in_omp_workshare = false; + gfc_code_walker (&ns->code, simplify_io_impl_do, dummy_expr_callback, NULL); gfc_code_walker (&ns->code, convert_do_while, dummy_expr_callback, NULL); gfc_code_walker (&ns->code, convert_elseif, dummy_expr_callback, NULL); gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);