From patchwork Tue Jun 22 20:17:24 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Pop X-Patchwork-Id: 56572 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 85B2AB6F19 for ; Wed, 23 Jun 2010 07:39:47 +1000 (EST) Received: (qmail 4459 invoked by alias); 22 Jun 2010 21:39:45 -0000 Received: (qmail 4448 invoked by uid 22791); 22 Jun 2010 21:39:42 -0000 X-SWARE-Spam-Status: No, hits=-1.6 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, TW_FC, TW_KN, TW_QM, TW_QS, TW_TM X-Spam-Check-By: sourceware.org Received: from mail-gw0-f47.google.com (HELO mail-gw0-f47.google.com) (74.125.83.47) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 22 Jun 2010 21:39:35 +0000 Received: by gwb19 with SMTP id 19so1238950gwb.20 for ; Tue, 22 Jun 2010 14:39:33 -0700 (PDT) Received: by 10.229.185.141 with SMTP id co13mr3560260qcb.241.1277241303841; Tue, 22 Jun 2010 14:15:03 -0700 (PDT) MIME-Version: 1.0 Received: by 10.224.74.18 with HTTP; Tue, 22 Jun 2010 13:17:24 -0700 (PDT) From: Sebastian Pop Date: Tue, 22 Jun 2010 15:17:24 -0500 Message-ID: Subject: [patch] if-conversion of loops with conditionals containing memory loads and stores To: GCC Patches , Richard Guenther X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi, The attached patches (on top of the previous patch set) are needed to if-convert the DCT loop kernel of FFmpeg. 0009 replaces the memory writes that are in predicated basic blocks with a full write: "if (cond) A[i] = foo" is replaced with "A[i] = cond ? foo : A[i]". In order to do this, the patch replaces the call to gimple_assign_rhs_could_trap_p with gimple_could_trap_p, as we now have to check that the LHS of assign stmts in conditions do not trap. In order to if-convert the DCT kernel: for (i = 0; i <= nCoeffs; i++) { level = block[i]; if (level) { if (level < 0) { level = level * qmul - qadd; } else { level = level * qmul + qadd; } block[i] = level; } } 0010 proves that the accesses to an array in a loop do not trap (more (than the original non-if-converted loop)) if they are executed at every iteration. In this DCT kernel, the read into block[i] happens at every iteration, and thus the if-converted writes into block[i] will not trap more than the original code. The attached patches are the same as previously posted, with the exception of this part from 0009: (if_convertible_gimple_assign_stmt_p): Do not handle types that do not match is_gimple_reg_type. + if (!is_gimple_reg_type (TREE_TYPE (lhs))) + return false; + this makes the patch pass with the extra checks added by Richi to avoid operands of the COND_EXPR to contain memory loads or stores. These two patches passed regstrap with BOOT_CFLAGS= -g -O3 -msse2 on amd64-linux on top of the 8 other patches submitted separately. Ok for trunk? Thanks, Sebastian Pop --- AMD / Open Source Compiler Engineering / GNU Tools From d260c90bc66d257ee5c410c05f713cb6ab0c268c Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Thu, 20 May 2010 16:23:17 -0500 Subject: [PATCH 10/10] Do not check whether memory references accessed in every iteration trap. * gimple.c (gimple_op_could_trap_p): Outlined from gimple_could_trap_p_1. (gimple_could_trap_p_1): Call gimple_op_could_trap_p. * gimple.h (gimple_op_could_trap_p): Declared. * tree-data-ref.h (same_data_refs): New. * tree-if-conv.c (is_true_predicate): New. (is_predicated): Call is_true_predicate. (ifcvt_memrefs_will_trap): New. (operations_could_trap): New. (ifcvt_could_trap_p): New. (if_convertible_gimple_assign_stmt_p): Call ifcvt_could_trap_p. Gets a vector of data refs. (if_convertible_stmt_p): Same. (if_convertible_loop_p): Before freeing the data refs vector, pass it to if_convertible_stmt_p. * gcc.dg/tree-ssa/ifc-5.c: New. * gcc.dg/vect/vect-ifcvt-18.c: XFail-ed. * gcc.dg/vect/vect-cond-1.c: Update pattern. * gcc.dg/vect/vect-cond-3.c: Same. * gcc.dg/vect/vect-cond-4.c: Same. * gcc.dg/vect/vect-cond-6.c: Same. --- gcc/gimple.c | 38 +++++--- gcc/gimple.h | 1 + gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c | 31 +++++++ gcc/testsuite/gcc.dg/vect/vect-cond-1.c | 2 +- gcc/testsuite/gcc.dg/vect/vect-cond-3.c | 2 +- gcc/testsuite/gcc.dg/vect/vect-cond-4.c | 2 +- gcc/testsuite/gcc.dg/vect/vect-cond-6.c | 1 + gcc/tree-data-ref.h | 24 +++++ gcc/tree-if-conv.c | 150 +++++++++++++++++++++++++------ 9 files changed, 206 insertions(+), 45 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c diff --git a/gcc/gimple.c b/gcc/gimple.c index 1a10f31..b2082a0 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -2365,24 +2365,14 @@ gimple_rhs_has_side_effects (const_gimple s) return false; } +/* Helper for gimple_could_trap_p_1. Return true if the operation of + S can trap. */ -/* Helper for gimple_could_trap_p and gimple_assign_rhs_could_trap_p. - Return true if S can trap. If INCLUDE_LHS is true and S is a - GIMPLE_ASSIGN, the LHS of the assignment is also checked. - Otherwise, only the RHS of the assignment is checked. */ - -static bool -gimple_could_trap_p_1 (gimple s, bool include_lhs) +bool +gimple_op_could_trap_p (gimple s) { - unsigned i, start; - tree t, div = NULL_TREE; enum tree_code op; - - start = (is_gimple_assign (s) && !include_lhs) ? 1 : 0; - - for (i = start; i < gimple_num_ops (s); i++) - if (tree_could_trap_p (gimple_op (s, i))) - return true; + tree t, div = NULL_TREE; switch (gimple_code (s)) { @@ -2411,7 +2401,25 @@ gimple_could_trap_p_1 (gimple s, bool include_lhs) } return false; +} + +/* Helper for gimple_could_trap_p and gimple_assign_rhs_could_trap_p. + Return true if S can trap. If INCLUDE_LHS is true and S is a + GIMPLE_ASSIGN, the LHS of the assignment is also checked. + Otherwise, only the RHS of the assignment is checked. */ + +static bool +gimple_could_trap_p_1 (gimple s, bool include_lhs) +{ + unsigned i, start; + + start = (is_gimple_assign (s) && !include_lhs) ? 1 : 0; + + for (i = start; i < gimple_num_ops (s); i++) + if (tree_could_trap_p (gimple_op (s, i))) + return true; + return gimple_op_could_trap_p (s); } diff --git a/gcc/gimple.h b/gcc/gimple.h index 210a622..371cede 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -879,6 +879,7 @@ gimple gimple_build_cond_from_tree (tree, tree, tree); void gimple_cond_set_condition_from_tree (gimple, tree); bool gimple_has_side_effects (const_gimple); bool gimple_rhs_has_side_effects (const_gimple); +bool gimple_op_could_trap_p (gimple); bool gimple_could_trap_p (gimple); bool gimple_assign_rhs_could_trap_p (gimple); void gimple_regimplify_operands (gimple, gimple_stmt_iterator *); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c new file mode 100644 index 0000000..2ca1ac9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */ + +void +dct_unquantize_h263_inter_c (short *block, int n, int qscale, int nCoeffs) +{ + int i, level, qmul, qadd; + + qadd = (qscale - 1) | 1; + qmul = qscale << 1; + + for (i = 0; i <= nCoeffs; i++) + { + level = block[i]; + if (level) + { + if (level < 0) + { + level = level * qmul - qadd; + } + else + { + level = level * qmul + qadd; + } + block[i] = level; + } + } +} + +/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */ +/* { dg-final { cleanup-tree-dump "ifcvt" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-1.c b/gcc/testsuite/gcc.dg/vect/vect-cond-1.c index 4ee6713..888e5cb 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-cond-1.c +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-1.c @@ -52,7 +52,7 @@ int main (void) return 0; } -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-3.c b/gcc/testsuite/gcc.dg/vect/vect-cond-3.c index 56cfbb2..42b9f35 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-cond-3.c +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-3.c @@ -60,7 +60,7 @@ int main (void) return 0; } -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-4.c b/gcc/testsuite/gcc.dg/vect/vect-cond-4.c index c3a1585..1d1d8fc 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-cond-4.c +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-4.c @@ -57,7 +57,7 @@ int main (void) return 0; } -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-6.c b/gcc/testsuite/gcc.dg/vect/vect-cond-6.c index e5e9391..7820444 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-cond-6.c +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-6.c @@ -56,5 +56,6 @@ int main () } /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h index eff5348..95a734a 100644 --- a/gcc/tree-data-ref.h +++ b/gcc/tree-data-ref.h @@ -417,6 +417,30 @@ extern void create_rdg_vertices (struct graph *, VEC (gimple, heap) *); extern bool dr_may_alias_p (const struct data_reference *, const struct data_reference *); +/* Return true when the data references A and B are accessing the same + memory object with the same access functions. */ + +static inline bool +same_data_refs (data_reference_p a, data_reference_p b) +{ + unsigned int i; + + /* The references are exactly the same. */ + if (operand_equal_p (DR_REF (a), DR_REF (b), 0)) + return true; + + if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b) + || !dr_may_alias_p (a, b) + || !operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), 0)) + return false; + + for (i = 0; i < DR_NUM_DIMENSIONS (a); i++) + if (!eq_evolutions_p (DR_ACCESS_FN (a, i), DR_ACCESS_FN (b, i))) + return false; + + return true; +} + /* Return true when the DDR contains two data references that have the same access functions. */ diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index 3ae6c85..9ee276f 100644 --- a/gcc/tree-if-conv.c +++ b/gcc/tree-if-conv.c @@ -349,6 +349,96 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi) return true; } +/* Return true when the memory references of STMT will trap in the + if-converted code: i.e., when the data references of STMT are + executed unconditionally in the loop it is possible to say that in + the if-converted code there will not be more traps than in the + original non if-converted code. Supposing that the data refs are + accessed in basic blocks predicated differently, if it is possible + to prove that the same data references are executed at every + iteration, then the if-converted code won't have more traps than + the original code. */ + +static bool +ifcvt_memrefs_will_trap (gimple stmt, VEC (data_reference_p, heap) *refs) +{ + struct data_reference *a, *b; + unsigned int i, j; + tree cond = unfold_ssa_names (bb_predicate (gimple_bb (stmt)), 5); + + for (i = 0; VEC_iterate (data_reference_p, refs, i, a); i++) + if (DR_STMT (a) == stmt) + { + for (j = 0; VEC_iterate (data_reference_p, refs, j, b); j++) + if (i != j && same_data_refs (a, b)) + { + basic_block bb; + + if (!is_predicated (gimple_bb (DR_STMT (b)))) + { + cond = boolean_true_node; + break; + } + + bb = gimple_bb (DR_STMT (b)); + cond = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, cond, + unfold_ssa_names (bb_predicate (bb), 5)); + + if (is_true_predicate (cond)) + break; + } + + if (!is_true_predicate (cond)) + return true; + } + + return false; +} + +/* Same as gimple_could_trap_p_1, returns true when the statement S + could trap, but do not check the memory references. */ + +static bool +operations_could_trap (gimple s) +{ + unsigned i; + + for (i = 0; i < gimple_num_ops (s); i++) + { + tree expr = gimple_op (s, i); + + switch (TREE_CODE (expr)) + { + case ARRAY_REF: + case INDIRECT_REF: + case ALIGN_INDIRECT_REF: + case MISALIGNED_INDIRECT_REF: + break; + + default: + if (tree_could_trap_p (expr)) + return true; + } + } + + return gimple_op_could_trap_p (s); +} + +/* Wrapper around gimple_could_trap_p refined for the needs of the + if-conversion. Try to prove that the memory accesses of STMT could + not trap in the innermost loop containing STMT. */ + +static bool +ifcvt_could_trap_p (gimple stmt, VEC (data_reference_p, heap) *refs) +{ + if (gimple_has_mem_ops (stmt) + && !ifcvt_memrefs_will_trap (stmt, refs) + && !operations_could_trap (stmt)) + return false; + + return gimple_could_trap_p (stmt); +} + /* Return true when STMT is if-convertible. GIMPLE_ASSIGN statement is not if-convertible if, @@ -357,7 +447,8 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi) - LHS is not var decl. */ static bool -if_convertible_gimple_assign_stmt_p (gimple stmt) +if_convertible_gimple_assign_stmt_p (gimple stmt, + VEC (data_reference_p, heap) *refs) { tree lhs = gimple_assign_lhs (stmt); @@ -382,7 +473,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt) return false; } - if (gimple_could_trap_p (stmt)) + if (ifcvt_could_trap_p (stmt, refs)) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "tree could trap...\n"); @@ -399,7 +490,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt) - it is a GIMPLE_LABEL or a GIMPLE_COND. */ static bool -if_convertible_stmt_p (gimple stmt) +if_convertible_stmt_p (gimple stmt, VEC (data_reference_p, heap) *refs) { switch (gimple_code (stmt)) { @@ -409,7 +500,7 @@ if_convertible_stmt_p (gimple stmt) return true; case GIMPLE_ASSIGN: - return if_convertible_gimple_assign_stmt_p (stmt); + return if_convertible_gimple_assign_stmt_p (stmt, refs); default: /* Don't know what to do with 'em so don't do anything. */ @@ -692,6 +783,9 @@ if_convertible_loop_p (struct loop *loop) edge e; edge_iterator ei; basic_block exit_bb = NULL; + bool res = false; + VEC (data_reference_p, heap) *refs; + VEC (ddr_p, heap) *ddrs; /* Handle only innermost loop. */ if (!loop || loop->inner) @@ -729,18 +823,14 @@ if_convertible_loop_p (struct loop *loop) /* Don't if-convert the loop when the data dependences cannot be computed: the loop won't be vectorized in that case. */ - { - VEC (data_reference_p, heap) *refs = VEC_alloc (data_reference_p, heap, 5); - VEC (ddr_p, heap) *ddrs = VEC_alloc (ddr_p, heap, 25); - bool res = compute_data_dependences_for_loop (loop, true, &refs, &ddrs); - - free_data_refs (refs); - free_dependence_relations (ddrs); + refs = VEC_alloc (data_reference_p, heap, 5); + ddrs = VEC_alloc (ddr_p, heap, 25); + res = compute_data_dependences_for_loop (loop, true, &refs, &ddrs); - if (!res) - return false; - } + if (!res) + goto cleanup; + res = false; calculate_dominance_info (CDI_DOMINATORS); /* Allow statements that can be handled during if-conversion. */ @@ -749,7 +839,7 @@ if_convertible_loop_p (struct loop *loop) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Irreducible loop\n"); - return false; + goto cleanup; } for (i = 0; i < loop->num_nodes; i++) @@ -757,14 +847,18 @@ if_convertible_loop_p (struct loop *loop) basic_block bb = ifc_bbs[i]; if (!if_convertible_bb_p (loop, bb, exit_bb)) - return false; + goto cleanup; if (bb_with_exit_edge_p (loop, bb)) exit_bb = bb; } - if (!predicate_bbs (loop)) - return false; + res = predicate_bbs (loop); + + if (!res) + goto cleanup; + + res = false; for (i = 0; i < loop->num_nodes; i++) { @@ -773,21 +867,23 @@ if_convertible_loop_p (struct loop *loop) for (itr = gsi_start_phis (bb); !gsi_end_p (itr); gsi_next (&itr)) if (!if_convertible_phi_p (loop, bb, gsi_stmt (itr))) - return false; + goto cleanup; - /* For non predicated BBs, don't check their statements. */ - if (!is_predicated (bb)) - continue; - - for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr)) - if (!if_convertible_stmt_p (gsi_stmt (itr))) - return false; + /* Check the if-convertibility of statements in predicated BBs. */ + if (is_predicated (bb)) + for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr)) + if (!if_convertible_stmt_p (gsi_stmt (itr), refs)) + goto cleanup; } + res = true; if (dump_file) fprintf (dump_file, "Applying if-conversion\n"); - return true; + cleanup: + free_data_refs (refs); + free_dependence_relations (ddrs); + return res; } /* Basic block BB has two predecessors. Using predecessor's bb -- 1.7.0.4