From patchwork Tue Jun 30 10:05:51 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 489620
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <559269FF.4070400@mentor.com>
Date: Tue, 30 Jun 2015 12:05:51 +0200
From: Tom de Vries
To: Richard Biener
CC: "gcc-patches@gnu.org"
Subject: Re: [PATCH, 2/2][PR66642] Add empty loop exit block in transform_to_exit_first_loop_alt
References: <558BB12B.7060108@mentor.com>
In-Reply-To: <558BB12B.7060108@mentor.com>

On 25/06/15 09:43, Tom de Vries wrote:
> Hi,
>
> I ran into a failure with parloops for reduction loop testcase
> libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c. When we
> exercise the low iteration count loop, the test-case fails.
>
> To understand the problem, let's first look at what happens when we use
> transform_to_exit_first_loop (the original one) instead of
> transform_to_exit_first_loop_alt (the alternative one, which is
> currently used, and causing the failure).
>
> Before transform_to_exit_first_loop, the low iteration count loop and
> the main loop share the loop exit block. After
> transform_to_exit_first_loop, that's no longer the case: the main loop
> now has an exit block with a single predecessor. Subsequently,
> separate_decls_in_region inserts code in the main loop exit block, which
> is only triggered upon exit of the main loop.
>
> However, transform_to_exit_first_loop_alt does not insert such an exit
> block, and the code inserted by separate_decls_in_region is also active
> for the low iteration count loop, which results in an incorrect
> reduction result when the low iteration count loop is used.
>
>
> This patch fixes the problem by making sure
> transform_to_exit_first_loop_alt adds a new exit block in between the
> main loop header and the old exit block.
>
> Updated test-cases after commit of fix for PR66652, reposting.
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?
>

Thanks,
- Tom

Add empty loop exit block in transform_to_exit_first_loop_alt

2015-06-24  Tom de Vries

	PR tree-optimization/66642
	* tree-parloops.c (transform_to_exit_first_loop_alt): Update function
	header comment.  Rename split_edge variable to edge_at_split.  Split
	exit edge to create new loop exit bb.  Insert loop exit phis in new
	loop exit bb.

	* testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c (main): Test low
	iteration count case.
	* testsuite/libgomp.c/parloops-exit-first-loop-alt.c (init): New
	function, factor out of ...
	(main): ... here.  Test low iteration count case.
---
 gcc/tree-parloops.c                                | 45 ++++++++++++++++------
 .../libgomp.c/parloops-exit-first-loop-alt-3.c     |  5 +++
 .../libgomp.c/parloops-exit-first-loop-alt.c       | 28 +++++++++++++-
 3 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 21ed17b..19c1aa5 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1535,7 +1535,7 @@ replace_uses_in_bbs_by (tree name, tree val, bitmap bbs)
    goto
 
    :
-   sum_z = PHI
+   sum_z = PHI
 
    [1] Where is single_pred (bb latch); In the simplest case,
        that's .
@@ -1562,14 +1562,17 @@ replace_uses_in_bbs_by (tree name, tree val, bitmap bbs)
    if (ivtmp_c < n + 1)
      goto ;
    else
-     goto ;
+     goto ;
 
    :
    ivtmp_b = ivtmp_a + 1;
    goto
 
+   :
+   sum_y = PHI
+
    :
-   sum_z = PHI
+   sum_z = PHI
 
 
    In unified diff format:
@@ -1606,9 +1609,12 @@ replace_uses_in_bbs_by (tree name, tree val, bitmap bbs)
 -     goto
 +     goto
 
++   :
++   sum_y = PHI
+
     :
--   sum_z = PHI
-+   sum_z = PHI
+-   sum_z = PHI
++   sum_z = PHI
 
    Note: the example does not show any virtual phis, but these are handled more
    or less as reductions.
@@ -1646,7 +1652,7 @@ transform_to_exit_first_loop_alt (struct loop *loop,
 
   /* Create the new_header block.  */
   basic_block new_header = split_block_before_cond_jump (exit->src);
-  edge split_edge = single_pred_edge (new_header);
+  edge edge_at_split = single_pred_edge (new_header);
 
   /* Redirect entry edge to new_header.  */
   edge entry = loop_preheader_edge (loop);
@@ -1663,9 +1669,9 @@ transform_to_exit_first_loop_alt (struct loop *loop,
   e = redirect_edge_and_branch (post_cond_edge, header);
   gcc_assert (e == post_cond_edge);
 
-  /* Redirect split_edge to latch.  */
-  e = redirect_edge_and_branch (split_edge, latch);
-  gcc_assert (e == split_edge);
+  /* Redirect edge_at_split to latch.  */
+  e = redirect_edge_and_branch (edge_at_split, latch);
+  gcc_assert (e == edge_at_split);
 
   /* Set the new loop bound.
  */
   gimple_cond_set_rhs (cond_stmt, bound);
@@ -1718,21 +1724,36 @@ transform_to_exit_first_loop_alt (struct loop *loop,
   /* Set the latch arguments of the new phis to ivtmp/sum_b.  */
   flush_pending_stmts (post_inc_edge);
 
-  /* Register the reduction exit phis.  */
+  /* Create a new empty exit block, inbetween the new loop header and the old
+     exit block.  The function separate_decls_in_region needs this block to
+     insert code that is active on loop exit, but not any other path.  */
+  basic_block new_exit_block = split_edge (exit);
+
+  /* Insert and register the reduction exit phis.  */
   for (gphi_iterator gsi = gsi_start_phis (exit_block);
        !gsi_end_p (gsi);
        gsi_next (&gsi))
     {
       gphi *phi = gsi.phi ();
       tree res_z = PHI_RESULT (phi);
+
+      /* Now that we have a new exit block, duplicate the phi of the old exit
+         block in the new exit block to preserve loop-closed ssa.  */
+      edge succ_new_exit_block = single_succ_edge (new_exit_block);
+      edge pred_new_exit_block = single_pred_edge (new_exit_block);
+      tree res_y = copy_ssa_name (res_z, phi);
+      gphi *nphi = create_phi_node (res_y, new_exit_block);
+      tree res_c = PHI_ARG_DEF_FROM_EDGE (phi, succ_new_exit_block);
+      add_phi_arg (nphi, res_c, pred_new_exit_block, UNKNOWN_LOCATION);
+      add_phi_arg (phi, res_y, succ_new_exit_block, UNKNOWN_LOCATION);
+
       if (virtual_operand_p (res_z))
         continue;
 
-      tree res_c = PHI_ARG_DEF_FROM_EDGE (phi, exit);
       gimple reduc_phi = SSA_NAME_DEF_STMT (res_c);
       struct reduction_info *red = reduction_phi (reduction_list, reduc_phi);
       if (red != NULL)
-        red->keep_res = phi;
+        red->keep_res = nphi;
     }
 
   /* We're going to cancel the loop at the end of gen_parallel_loop, but until
diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
index 7de1377..78365e8 100644
--- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
+++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c
@@ -36,5 +36,10 @@ main (void)
   if (res !=
       11995)
     abort ();
 
+  /* Test low iteration count case.  */
+  res = f (10, a);
+  if (res != 25)
+    abort ();
+
   return 0;
 }
diff --git a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c
index 07468a9..a49454b 100644
--- a/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c
+++ b/libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c
@@ -22,8 +22,8 @@ f (unsigned int n, unsigned int *__restrict__ a, unsigned int *__restrict__ b,
     c[i] = a[i] + b[i];
 }
 
-int
-main (void)
+static void __attribute__((noclone,noinline))
+init (void)
 {
   int i, j;
 
@@ -36,6 +36,14 @@ main (void)
         b[k] = (k * 3) % 7;
         c[k] = k * 2;
       }
+}
+
+int
+main (void)
+{
+  int i;
+
+  init ();
 
   f (N, a, b, c);
 
@@ -47,5 +55,21 @@ main (void)
         abort ();
     }
 
+  /* Test low iteration count case.  */
+
+  init ();
+
+  f (10, a, b, c);
+
+  for (i = 0; i < N; i++)
+    {
+      unsigned int actual = c[i];
+      unsigned int expected = (i < 10
+                               ? i + ((i * 3) % 7)
+                               : i * 2);
+      if (actual != expected)
+        abort ();
+    }
+
   return 0;
 }
-- 
1.9.1