From patchwork Sat Oct  8 14:32:17 2011
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 118543
Subject: Re: [PATCH] Fix PR46556 (poor address generation)
From: "William J. Schmidt"
To: Richard Guenther
Cc: gcc-patches@gcc.gnu.org, bergner@vnet.ibm.com, rguenth@gcc.gnu.org
Date: Sat, 08 Oct 2011 09:32:17 -0500
Message-ID: <1318084337.18738.12.camel@gnopaine>
References: <1317831212.4199.45.camel@oc2474580526.ibm.com>
	<1317908597.16896.35.camel@gnopaine>
	<1317916185.16896.53.camel@gnopaine>

Greetings,

Here are the revised changes for the tree portions of the patch.  I've
attempted to resolve all comments to date on those portions.

Per Steven's comment, I moved copy_ref_info into tree-ssa-address.c; let
me know if there's a better place, or whether you'd prefer to leave it
where it was.

I looked into changing the second reassoc pass to use a separate
pass_late_reassoc entry, but this impacted the test suite.  There are
about 20 tests that rely on -fdump-tree-reassoc being associated with
two dump files named reassoc1 and reassoc2.  Rather than change all
these test cases for a temporary solution, I chose to use the deprecated
first_pass_instance boolean to distinguish between the two passes.  I
marked this as a Bad Thing; it will be removed once I have time to work
on the straight-line strength reducer.

I looked into adding a test case with a negative offset, but was unable
to come up with a construct that would have a negative offset on the
base MEM_REF and still be recognized by this particular pattern matcher.
In any case, the use of double_ints throughout should remove that
concern.

Thanks,
Bill


2011-10-08  Bill Schmidt

gcc:

	PR rtl-optimization/46556
	* tree.h (copy_ref_info): Expose existing function.
	* tree-ssa-loop-ivopts.c (copy_ref_info): Move this function to...
	* tree-ssa-address.c (copy_ref_info): ...here, and remove static token.
	* tree-ssa-reassoc.c (restructure_base_and_offset): New function.
	(restructure_mem_ref): Likewise.
	(reassociate_bb): Look for opportunities to call restructure_mem_ref.

gcc/testsuite:

	PR rtl-optimization/46556
	* gcc.dg/tree-ssa/pr46556-1.c: New testcase.
	* gcc.dg/tree-ssa/pr46556-2.c: Likewise.
	* gcc.dg/tree-ssa/pr46556-3.c: Likewise.

Index: gcc/tree.h
===================================================================
--- gcc/tree.h	(revision 179708)
+++ gcc/tree.h	(working copy)
@@ -5777,6 +5777,7 @@ tree target_for_debug_bind (tree);
 /* In tree-ssa-address.c.  */
 extern tree tree_mem_ref_addr (tree, tree);
 extern void copy_mem_ref_info (tree, tree);
+extern void copy_ref_info (tree, tree);
 
 /* In tree-vrp.c */
 extern bool ssa_name_nonnegative_p (const_tree);
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-1.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_1\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 2 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-2.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 12)
+    foo (p->a[n], p->c[n], p->b[n]);
+  else if (n > 3)
+    foo (p->b[n], p->a[n], p->c[n]);
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_1\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 6 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr46556-3.c	(revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+struct x
+{
+  int a[16];
+  int b[16];
+  int c[16];
+};
+
+extern void foo (int, int, int);
+
+void
+f (struct x *p, unsigned int n)
+{
+  foo (p->a[n], p->c[n], p->b[n]);
+  if (n > 3)
+    {
+      foo (p->a[n], p->c[n], p->b[n]);
+      if (n > 12)
+	foo (p->b[n], p->a[n], p->c[n]);
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "p_1\\(D\\) \\+ D" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 6 "dom2" } } */
+/* { dg-final { cleanup-tree-dump "dom2" } } */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 179708)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -6278,64 +6278,6 @@ rewrite_use_nonlinear_expr (struct ivopts_data *da
     }
 }
 
-/* Copies the reference information from OLD_REF to NEW_REF.  */
-
-static void
-copy_ref_info (tree new_ref, tree old_ref)
-{
-  tree new_ptr_base = NULL_TREE;
-
-  TREE_SIDE_EFFECTS (new_ref) = TREE_SIDE_EFFECTS (old_ref);
-  TREE_THIS_VOLATILE (new_ref) = TREE_THIS_VOLATILE (old_ref);
-
-  new_ptr_base = TREE_OPERAND (new_ref, 0);
-
-  /* We can transfer points-to information from an old pointer
-     or decl base to the new one.  */
-  if (new_ptr_base
-      && TREE_CODE (new_ptr_base) == SSA_NAME
-      && !SSA_NAME_PTR_INFO (new_ptr_base))
-    {
-      tree base = get_base_address (old_ref);
-      if (!base)
-	;
-      else if ((TREE_CODE (base) == MEM_REF
-		|| TREE_CODE (base) == TARGET_MEM_REF)
-	       && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME
-	       && SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0)))
-	{
-	  struct ptr_info_def *new_pi;
-	  duplicate_ssa_name_ptr_info
-	    (new_ptr_base, SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0)));
-	  new_pi = SSA_NAME_PTR_INFO (new_ptr_base);
-	  /* We have to be careful about transfering alignment information.  */
-	  if (TREE_CODE (old_ref) == MEM_REF
-	      && !(TREE_CODE (new_ref) == TARGET_MEM_REF
-		   && (TMR_INDEX2 (new_ref)
-		       || (TMR_STEP (new_ref)
-			   && (TREE_INT_CST_LOW (TMR_STEP (new_ref))
-			       < new_pi->align)))))
-	    {
-	      new_pi->misalign += double_int_sub (mem_ref_offset (old_ref),
-						  mem_ref_offset (new_ref)).low;
-	      new_pi->misalign &= (new_pi->align - 1);
-	    }
-	  else
-	    {
-	      new_pi->align = 1;
-	      new_pi->misalign = 0;
-	    }
-	}
-      else if (TREE_CODE (base) == VAR_DECL
-	       || TREE_CODE (base) == PARM_DECL
-	       || TREE_CODE (base) == RESULT_DECL)
-	{
-	  struct ptr_info_def *pi = get_ptr_info (new_ptr_base);
-	  pt_solution_set_var (&pi->pt, base);
-	}
-    }
-}
-
 /* Performs a peephole optimization to reorder the iv update statement with
    a mem ref to enable instruction combining in later phases.
    The mem ref uses the iv value before the update, so the reordering
    transformation requires
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	(revision 179708)
+++ gcc/tree-ssa-address.c	(working copy)
@@ -832,6 +832,64 @@ copy_mem_ref_info (tree to, tree from)
   TREE_THIS_VOLATILE (to) = TREE_THIS_VOLATILE (from);
 }
 
+/* Copies the reference information from OLD_REF to NEW_REF.  */
+
+void
+copy_ref_info (tree new_ref, tree old_ref)
+{
+  tree new_ptr_base = NULL_TREE;
+
+  TREE_SIDE_EFFECTS (new_ref) = TREE_SIDE_EFFECTS (old_ref);
+  TREE_THIS_VOLATILE (new_ref) = TREE_THIS_VOLATILE (old_ref);
+
+  new_ptr_base = TREE_OPERAND (new_ref, 0);
+
+  /* We can transfer points-to information from an old pointer
+     or decl base to the new one.  */
+  if (new_ptr_base
+      && TREE_CODE (new_ptr_base) == SSA_NAME
+      && !SSA_NAME_PTR_INFO (new_ptr_base))
+    {
+      tree base = get_base_address (old_ref);
+      if (!base)
+	;
+      else if ((TREE_CODE (base) == MEM_REF
+		|| TREE_CODE (base) == TARGET_MEM_REF)
+	       && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME
+	       && SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0)))
+	{
+	  struct ptr_info_def *new_pi;
+	  duplicate_ssa_name_ptr_info
+	    (new_ptr_base, SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0)));
+	  new_pi = SSA_NAME_PTR_INFO (new_ptr_base);
+	  /* We have to be careful about transfering alignment information.  */
+	  if (TREE_CODE (old_ref) == MEM_REF
+	      && !(TREE_CODE (new_ref) == TARGET_MEM_REF
+		   && (TMR_INDEX2 (new_ref)
+		       || (TMR_STEP (new_ref)
+			   && (TREE_INT_CST_LOW (TMR_STEP (new_ref))
+			       < new_pi->align)))))
+	    {
+	      new_pi->misalign += double_int_sub (mem_ref_offset (old_ref),
+						  mem_ref_offset (new_ref)).low;
+	      new_pi->misalign &= (new_pi->align - 1);
+	    }
+	  else
+	    {
+	      new_pi->align = 1;
+	      new_pi->misalign = 0;
+	    }
+	}
+      else if (TREE_CODE (base) == VAR_DECL
+	       || TREE_CODE (base) == PARM_DECL
+	       || TREE_CODE (base) == RESULT_DECL)
+	{
+	  struct ptr_info_def *pi = get_ptr_info (new_ptr_base);
+	  pt_solution_set_var (&pi->pt, base);
+	}
+    }
+}
+
 /* Move constants in target_mem_ref REF to offset.  Returns the new target
    mem ref if anything changes, NULL_TREE otherwise.  */
Index: gcc/tree-ssa-reassoc.c
===================================================================
--- gcc/tree-ssa-reassoc.c	(revision 179708)
+++ gcc/tree-ssa-reassoc.c	(working copy)
@@ -2815,6 +2815,121 @@ break_up_subtract_bb (basic_block bb)
     break_up_subtract_bb (son);
 }
 
+/* Given BASE and OFFSET derived from *EXPR occurring in stmt *GSI,
+   determine whether there is a profitable opportunity to restructure
+   address arithmetic within BASE and OFFSET.  If so, produce such
+   a restructuring and return it.  */
+
+static tree
+restructure_base_and_offset (tree *expr, gimple_stmt_iterator *gsi,
+			     tree base, tree offset, HOST_WIDE_INT bitpos)
+{
+  tree mult_op0, mult_op1, t1, t2, offset_type, mult_expr, add_expr, mem_ref;
+  double_int c, c1, c2, c3, c4;
+
+  /* Look for the following pattern:
+
+       base   = MEM_REF (T1, C1);
+       offset = MULT_EXPR (PLUS_EXPR (T2, C2), C3)
+       bitpos = C4 * BITS_PER_UNIT
+
+     If found, convert it to:
+
+       MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)),
+                C1 + (C2 * C3) + C4)
+  */
+  if (!base
+      || !offset
+      || TREE_CODE (base) != MEM_REF
+      || TREE_CODE (TREE_OPERAND (base, 1)) != INTEGER_CST
+      || TREE_CODE (offset) != MULT_EXPR)
+    return NULL_TREE;
+
+  mult_op0 = TREE_OPERAND (offset, 0);
+  mult_op1 = TREE_OPERAND (offset, 1);
+
+  if (TREE_CODE (mult_op0) != PLUS_EXPR
+      || TREE_CODE (mult_op1) != INTEGER_CST
+      || TREE_CODE (TREE_OPERAND (mult_op0, 1)) != INTEGER_CST)
+    return NULL_TREE;
+
+  t1 = TREE_OPERAND (base, 0);
+  t2 = TREE_OPERAND (mult_op0, 0);
+  c1 = TREE_INT_CST (TREE_OPERAND (base, 1));
+  c2 = TREE_INT_CST (TREE_OPERAND (mult_op0, 1));
+  c3 = TREE_INT_CST (mult_op1);
+  c4 = uhwi_to_double_int (bitpos / BITS_PER_UNIT);
+  c = double_int_add (double_int_add (c1, double_int_mul (c2, c3)), c4);
+
+  if (!double_int_fits_in_shwi_p (c)
+      || !double_int_fits_in_shwi_p (c3))
+    return NULL_TREE;
+
+  offset_type = TREE_TYPE (TREE_OPERAND (base, 1));
+
+  mult_expr = fold_build2 (MULT_EXPR, sizetype, t2,
+			   build_int_cst (sizetype, double_int_to_shwi (c3)));
+  mult_expr = force_gimple_operand_gsi (gsi, mult_expr, true, NULL,
+					true, GSI_SAME_STMT);
+  add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (t1), t1, mult_expr);
+  add_expr = force_gimple_operand_gsi (gsi, add_expr, true, NULL,
+				       true, GSI_SAME_STMT);
+  mem_ref = fold_build2 (MEM_REF, TREE_TYPE (*expr), add_expr,
+			 build_int_cst (offset_type, double_int_to_shwi (c)));
+  return mem_ref;
+}
+
+/* If *EXPR contains a memory reference, determine whether there is an
+   opportunity to restructure the base and offset expressions for
+   improved performance.  */
+
+static bool
+restructure_mem_ref (tree *expr, gimple_stmt_iterator *gsi)
+{
+  enum tree_code code = TREE_CODE (*expr);
+  tree base, offset, mem_ref;
+  HOST_WIDE_INT bitsize, bitpos;
+  enum machine_mode mode;
+  int unsignedp, volatilep;
+
+  /* Only expressions that reference memory are of interest.  Bitfield
+     references should eventually be lowered prior to this point and
+     are not dealt with.  Skip BLKmode aggregates as well.  */
+  if (! handled_component_p (*expr)
+      || code == BIT_FIELD_REF
+      || (code == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (*expr, 1)))
+      || TYPE_MODE (TREE_TYPE (*expr)) == BLKmode)
+    return false;
+
+  /* Determine whether the expression can be represented with base and
+     offset components.  */
+  base = get_inner_reference (*expr, &bitsize, &bitpos, &offset, &mode,
+			      &unsignedp, &volatilep, false);
+  if (!base || !offset)
+    return false;
+
+  /* Look for a restructuring opportunity.  */
+  if ((mem_ref = restructure_base_and_offset (expr, gsi, base,
+					      offset, bitpos)) == NULL_TREE)
+    return false;
+
+  /* Found one.  Record the replacement in the dump file and complete
+     the replacement.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "\nOriginal_expr = ");
+      print_generic_expr (dump_file, *expr, TDF_VOPS|TDF_MEMSYMS);
+      fprintf (dump_file, "\nmem_ref = ");
+      print_generic_expr (dump_file, mem_ref, TDF_VOPS|TDF_MEMSYMS);
+      fprintf (dump_file, "\n\n");
+    }
+
+  copy_ref_info (mem_ref, *expr);
+  *expr = mem_ref;
+
+  return true;
+}
+
 /* Reassociate expressions in basic block BB and its post-dominator as
    children.  */
 
@@ -2823,14 +2938,43 @@ reassociate_bb (basic_block bb)
 {
   gimple_stmt_iterator gsi;
   basic_block son;
+  bool in_loop = loop_depth (bb->loop_father) != 0;
 
   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
 
-      if (is_gimple_assign (stmt)
-	  && !stmt_could_throw_p (stmt))
+      /* During late reassociation only, look for restructuring
+	 opportunities within an expression that references memory.
+	 We only do this for blocks not contained in loops, since the
+	 ivopts machinery does a good job on loop expressions, and we
+	 don't want to interfere with other loop optimizations.  */
+      /* TODO: This belongs more properly in a separate pass that
+	 performs general strength reduction on straight-line code.
+	 Eventually move this there.  */
+      if (!first_pass_instance /* TODO: deprecated */
+	  && !in_loop
+	  && gimple_vuse (stmt)
+	  && gimple_assign_single_p (stmt))
 	{
+	  tree *lhs, *rhs;
+	  if (gimple_vdef (stmt))
+	    {
+	      lhs = gimple_assign_lhs_ptr (stmt);
+	      if (restructure_mem_ref (lhs, &gsi))
+		update_stmt (stmt);
+	    }
+	  else if (gimple_vuse (stmt))
+	    {
+	      rhs = gimple_assign_rhs1_ptr (stmt);
+	      if (restructure_mem_ref (rhs, &gsi))
+		update_stmt (stmt);
+	    }
+	}
+
+      else if (is_gimple_assign (stmt)
+	       && !stmt_could_throw_p (stmt))
+	{
 	  tree lhs, rhs1, rhs2;
 	  enum tree_code rhs_code = gimple_assign_rhs_code (stmt);