From patchwork Wed Oct 16 09:02:25 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 283887 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id B502F2C036A for ; Wed, 16 Oct 2013 20:02:37 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; q=dns; s=default; b=Na3A2oKUB0e2a5di RZMOmZo8znPb8BYkgnK8ocHFM/0uF1QD+DGpefrtec5tIFBuIjA/gnj/dH+eFsoR O5hrYEqmAXfcpGNAbPWzt5uShwoL8dBnaW6DC+FL7qfa35Yl/pDyukd/IWjLpNNQ toQ0I49rJWXwqAv/qILgKEdVfUA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; s=default; bh=k+ZxfiJvEd3ZA93i/lq1p1 AdCTw=; b=M4HMAeBkhFru6SMp8QTyDZ2cGT+KDND0+xN+YKi4zt6iUajD0BH3H8 kaAaDKRwuDKT/amHA8CmfG93U/t1IFMiJqVoKScLsWCzz/VLfQbmg9WjiMxY8fZy tkDSDl+/t4AcZv+Jj37oL40pSf+GjJ9cbOeNbxytThS3LKzISJQ60= Received: (qmail 5294 invoked by alias); 16 Oct 2013 09:02:30 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 5277 invoked by uid 89); 16 Oct 2013 09:02:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.7 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 16 Oct 2013 09:02:27 +0000 Received: from relay1.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 7D2D0A5D7C; Wed, 16 Oct 2013 11:02:25 +0200 (CEST) Date: Wed, 16 Oct 2013 11:02:25 +0200 (CEST) From: Richard Biener To: Cong Hou Cc: Jeff Law , GCC Patches Subject: Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization. In-Reply-To: Message-ID: References: <525D9871.2030304@redhat.com> User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 On Tue, 15 Oct 2013, Cong Hou wrote: > Thank you for your reminder, Jeff! I just noticed Richard's comment. I > have modified the patch according to that. > > The new patch is attached. (posting patches inline is easier for review, now you have to deal with no quoting markers ;)) Comments inline. + { + tree op = gimple_op (stmt, i); + if (TREE_CODE (op) == INTEGER_CST + || TREE_CODE (op) == REAL_CST) There are other constants as well - just check if (is_gimple_min_invariant (op)) + continue; + if (TREE_CODE (op) == SSA_NAME) + { + gimple def = SSA_NAME_DEF_STMT (op); + if (def == stmt with starting at op 1 you can drop this. + || gimple_nop_p (def) + || !flow_bb_inside_loop_p (loop, gimple_bb (def))) + continue; + } Note that you fail to hoist if-converted code this way because op1 of name_1 = a_2 < b_4 ? x_5 : y_6; is 'a_2 < b_4', a tree expression and not an SSA name (ugh). Maybe we don't care (it's just a missed optimization), but if you are restricting yourself to hoisting assignments without a data-ref then you can walk over SSA uses on the stmt (instead of over gimple ops) with FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_USE) which would automagically take care of that case. Can you add a testcase which involves invariant if-conversion? Can you add a testcase with just an invariant store to make sure we don't wreck it? Can you add a testcase with an invariant store and load (the reduction case), too? + hoist = false; + break; + } For example you'll hoist all labels this way (no ops), as well as the loop exit GIMPLE_COND in case it's operands were loop invariant, breaking the CFG ... so please add && is_gimple_assign () to the if (!dr) check ;) + if (hoist) + { + gsi_remove (&si, false); + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt); + + if (dump_enabled_p ()) + { + dump_printf_loc + (MSG_NOTE, vect_location, + "hoist the statement to outside of the loop "); + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0); + dump_printf (MSG_NOTE, "\n"); + } + continue; + } + } + gsi_next (&si); + } + } + /* End loop-exit-fixes after versioning. */ if (cond_expr_stmt_list) > > thanks, > Cong > > > On Tue, Oct 15, 2013 at 12:33 PM, Jeff Law wrote: > > On 10/14/13 17:31, Cong Hou wrote: > >> > >> Any comment on this patch? > > > > Richi replied in the BZ you opened. > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58508 > > > > Essentially he said emit the load on the edge rather than in the block > > itself. > > jeff > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8a38316..2637309 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2013-10-15 Cong Hou + + * tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant + statement that contains data refs with zero-step. + 2013-10-14 David Malcolm * dumpfile.h (gcc::dump_manager): New class, to hold state diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 075d071..9d0f4a5 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,7 @@ +2013-10-15 Cong Hou + + * gcc.dg/vect/pr58508.c: New test. + 2013-10-14 Tobias Burnus PR fortran/58658 diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c b/gcc/testsuite/gcc.dg/vect/pr58508.c new file mode 100644 index 0000000..cb22b50 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr58508.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */ + + +/* The GCC vectorizer generates loop versioning for the following loop + since there may exist aliasing between A and B. The predicate checks + if A may alias with B across all iterations. Then for the loop in + the true body, we can assert that *B is a loop invariant so that + we can hoist the load of *B before the loop body. */ + +void foo (int* a, int* b) +{ + int i; + for (i = 0; i < 100000; ++i) + a[i] = *b + 1; +} + + +/* { dg-final { scan-tree-dump-times "hoist" 2 "vect" } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c index 574446a..f4fdec2 100644 --- a/gcc/tree-vect-loop-manip.c +++ b/gcc/tree-vect-loop-manip.c @@ -2477,6 +2477,92 @@ vect_loop_versioning (loop_vec_info loop_vinfo, adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi)); } Note that applying this kind of transform at this point invalidates some of the earlier analysis the vectorizer performed (namely the def-kind which now effectively gets vect_external_def from vect_internal_def). In this case it doesn't seem to cause any issues (we re-compute the def-kind everytime we need it (how wasteful)). + /* Extract load and store statements on pointers with zero-stride + accesses. */ + if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)) + { + /* In the loop body, we iterate each statement to check if it is a load + or store. Then we check the DR_STEP of the data reference. If + DR_STEP is zero, then we will hoist the load statement to the loop + preheader, and move the store statement to the loop exit. */ We don't move the store yet. Micha has a patch pending that enables vectorization of zero-step stores. + for (gimple_stmt_iterator si = gsi_start_bb (loop->header); + !gsi_end_p (si);) While technically ok now (vectorized loops contain a single basic block) please use LOOP_VINFO_BBS () to get at the vector of basic-blcoks and iterate over them like other code does. + { + gimple stmt = gsi_stmt (si); + stmt_vec_info stmt_info = vinfo_for_stmt (stmt); + struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); + + if (dr && integer_zerop (DR_STEP (dr))) + { + if (DR_IS_READ (dr)) + { + if (dump_enabled_p ()) + { + dump_printf_loc + (MSG_NOTE, vect_location, + "hoist the statement to outside of the loop "); "hoisting out of the vectorized loop: " + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0); + dump_printf (MSG_NOTE, "\n"); + } + + gsi_remove (&si, false); + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt); Note that this will result in a bogus VUSE on the stmt at this point which will be only fixed because of implementation details of loop versioning. Either get the correct VUSE from the loop header virtual PHI node preheader edge (if there is none then the current VUSE is the correct one to use) or clear it. + } + /* TODO: We also consider vectorizing loops containing zero-step + data refs as writes. For example: + + int a[N], *s; + for (i = 0; i < N; i++) + *s += a[i]; + + In this case the write to *s can be also moved after the + loop. */ Note that you then invalidate even more vectorizer analysis - you basically introduce a scalar reduction (you have to add a PHI node). Which means that the transform has to happen elsewhere. As Jakub now tries with if-conversion this would also be a candidate for applying the loop versioning before even starting the rest of the vectorizer analysis code. That said, I'd remove the TODO at this point because it's clearly not possible to implement just here ;) + continue; + } + else if (!dr) + { + bool hoist = true; + for (size_t i = 0; i < gimple_num_ops (stmt); i++) You are checking all kinds of statements, including assignments of which you are also checking the LHS ... restricting to assignments you can start walking at i = 1.