From patchwork Tue Apr 26 22:06:09 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Easwaran Raman
X-Patchwork-Id: 92964
In-Reply-To: <4DB5A955.7030400@redhat.com>
References: <4DB5A955.7030400@redhat.com>
Date: Tue, 26 Apr 2011 15:06:09 -0700
Subject: Re: Improve DSE in the presence of calls
From: Easwaran Raman
To: Jeff Law
Cc: gcc-patches@gcc.gnu.org, Richard Guenther

On Mon, Apr 25, 2011 at 10:03 AM, Jeff Law wrote:
> On 04/22/11 14:19, Easwaran Raman wrote:
>> Hi,
>>  This patch improves RTL DSE by not assuming that calls read all
>> memory locations. With this patch, calls are assumed to read any
>> non-frame memory and any stack variables that can potentially escape.
>> This patch partly addresses PR rtl-optimization/44194. Bootstraps and
>> no test regressions. OK for trunk?
>>
>> Thanks,
>> Easwaran
>>
>> 2011-04-22  Easwaran Raman
>>
>>       PR rtl-optimization/44194
>>       * dse.c (header files): Include tree-flow.h.
>>       (group_info): Add fields.
>>       (globals): Add a new variable kill_on_calls.
>>       (get_group_info): Initialize added fields.
>>       (dse_step0): Initialize kill_on_calls.
>>       (can_escape): New function.
>>       (record_store): Pass EXPR corresponding to MEM to
>>       set_usage_bits.
>>       (dse_step2_nospill): Set kill_on_calls based on
>>       group->escaped_*.
>>       (scan_reads_nospill): Handle call instructions.
>>       (find_insn_before_first_wild_read): Remove the function.
>>       (dse_step3_scan): Remove call to find_insn_before_first_wild_read.
>>       (dse_step5_nospill): Do not kill everything on call.
>>
>> testsuite/ChangeLog:
>>
>> 2011-04-22  Easwaran Raman
>>
>>       PR rtl-optimization/44194
>>       * gcc.dg/pr44194-1.c: New test.

> On a bookkeeping note, it doesn't appear that you ever release the
> bitmaps for gi->escaped_{p,n} or kill_on_calls.  This should probably be
> done in dse_step7.

I've now freed the bitmaps in dse_step7.

> I'm going to assume that you and Richi have agreed upon the form of
> can_escape.

Yes, Richard suggested using may_be_aliased for now to check for
escaping.

> AFAICT, even after your patch we still have a wild read recorded for
> most CALL_INSNs (see this code in scan_insn, which was not modified):
>
>      else
>        /* Every other call, including pure functions, may read memory.  */
>        add_wild_read (bb_info);
>
> It appears that you effectively ignore the wild_read in
> scan_reads_nospill by special casing CALL_INSNs.  Correct?  If so, does
> it still make sense to call add_wild_read in scan_insn?
>
> ISTM that wild reads are still possible due to a few other constructs
> such as asms, address canonicalization failure, etc, so why is the
> removal of the wild read support in dse_step5_nospill correct?

You're right, the patch had correctness issues. It is not possible to
simply skip the call to add_wild_read, because it also resets
active_local_stores and frees the read_recs. During the local phase it
seems better to just treat calls as wild reads and reset
active_local_stores and free the read_recs. I've now refactored
add_wild_read so that resetting active_local_stores and freeing the
read_recs are separate functions that can be called on
non-const/non-memset calls. In addition, I have added a
non_frame_wild_read field in insn_info to mark non-const and non-memset
calls. I've attached the revised patch.

-Easwaran

> Just to be clear, I generally like the change, but there are some
> details that I think need to be better understood before installing.
>
> jeff

2011-04-26  Easwaran Raman

        * dse.c: Include tree-flow.h.
        (insn_info): Add new field non_frame_wild_read.
        (group_info): Add new fields escaped_n and escaped_p.
        (kill_on_calls): New variable.
        (get_group_info): Initialize gi->escaped_n and gi->escaped_p.
        (dse_step0): Initialize kill_on_calls.
        (can_escape): New function.
        (set_usage_bits): Add additional parameter; record information
        about escaped locations.
        (record_store): Pass EXPR corresponding to MEM to set_usage_bits.
        (dse_step2_nospill): Set kill_on_calls based on group->escaped_n
        and group->escaped_p.
        (add_wild_read): Refactor into...
        (reset_active_stores): ... New function, and
        (free_read_records): ... New function.
        (add_non_frame_wild_read): New function.
        (scan_insn): Call add_non_frame_wild_read on non-const calls.
        (scan_reads_nospill): Handle instructions with non_frame_wild_read.
        (dse_step5_nospill): Call scan_reads_nospill for instructions
        marked as non_frame_wild_read.
        (dse_step7): Free escaped_n, escaped_p and kill_on_calls bitmaps.
testsuite/ChangeLog:

2011-04-26  Easwaran Raman

        * gcc.dg/pr44194-1.c: New test.

Index: gcc/testsuite/gcc.dg/pr44194-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr44194-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr44194-1.c	(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-dse1" } */
+#include <stdint.h>
+
+struct twoints { uint64_t a, b; } foo();
+void bar(uint64_t a, uint64_t b);
+
+void func() {
+  struct twoints s = foo();
+  bar(s.a, s.b);
+}
+/* { dg-final { scan-rtl-dump "global deletions = 2" "dse1" } } */
+/* { dg-final { cleanup-rtl-dump "dse1" } } */
Index: gcc/dse.c
===================================================================
--- gcc/dse.c	(revision 172844)
+++ gcc/dse.c	(working copy)
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "target.h"
 #include "params.h"
+#include "tree-flow.h"
 
 /* This file contains three techniques for performing Dead Store
    Elimination (dse).
@@ -326,6 +327,11 @@ struct insn_info
      contains a wild read, the use_rec will be null.  */
   bool wild_read;
 
+  /* This is true only for CALL instructions which could potentially read
+     any non-frame memory location.  This field is used by the global
+     algorithm.  */
+  bool non_frame_wild_read;
+
   /* This field is only used for the processing of const functions.
      These functions cannot read memory, but they can read the stack
      because that is where they may get their parms.  We need to be
@@ -501,6 +507,11 @@ struct group_info
      deleted.  */
   bitmap store1_n, store1_p, store2_n, store2_p;
 
+  /* These bitmaps keep track of offsets in this group that escape this
+     function.  An offset escapes if it corresponds to a named variable
+     whose addressable flag is set.  */
+  bitmap escaped_n, escaped_p;
+
   /* The positions in this bitmap have the same assignments as the in,
      out, gen and kill bitmaps.  This bitmap is all zeros except for
      the positions that are occupied by stores for this group.  */
@@ -588,6 +599,9 @@ static int spill_deleted;
 
 static bitmap all_blocks;
 
+/* Locations that are killed by calls in the global phase.  */
+static bitmap kill_on_calls;
+
 /* The number of bits used in the global bitmaps.  */
 static unsigned int current_position;
 
@@ -692,6 +706,8 @@ get_group_info (rtx base)
       gi->store1_p = BITMAP_ALLOC (NULL);
       gi->store2_n = BITMAP_ALLOC (NULL);
       gi->store2_p = BITMAP_ALLOC (NULL);
+      gi->escaped_p = BITMAP_ALLOC (NULL);
+      gi->escaped_n = BITMAP_ALLOC (NULL);
       gi->group_kill = BITMAP_ALLOC (NULL);
       gi->process_globally = false;
       gi->offset_map_size_n = 0;
@@ -714,6 +730,8 @@ get_group_info (rtx base)
       gi->store1_p = BITMAP_ALLOC (NULL);
       gi->store2_n = BITMAP_ALLOC (NULL);
       gi->store2_p = BITMAP_ALLOC (NULL);
+      gi->escaped_p = BITMAP_ALLOC (NULL);
+      gi->escaped_n = BITMAP_ALLOC (NULL);
       gi->group_kill = BITMAP_ALLOC (NULL);
       gi->process_globally = false;
       gi->frame_related =
@@ -739,6 +757,7 @@ dse_step0 (void)
   spill_deleted = 0;
 
   scratch = BITMAP_ALLOC (NULL);
+  kill_on_calls = BITMAP_ALLOC (NULL);
 
   rtx_store_info_pool
     = create_alloc_pool ("rtx_store_info_pool",
@@ -881,31 +900,48 @@ delete_dead_store_insn (insn_info_t insn_info)
   insn_info->wild_read = false;
 }
 
+/* Check if EXPR can possibly escape the current function scope.  */
+static bool
+can_escape (tree expr)
+{
+  tree base;
+  if (!expr)
+    return true;
+  base = get_base_address (expr);
+  if (DECL_P (base)
+      && !may_be_aliased (base))
+    return false;
+  return true;
+}
 
 /* Set the store* bitmaps offset_map_size* fields in GROUP based on
    OFFSET and WIDTH.  */
 
 static void
-set_usage_bits (group_info_t group, HOST_WIDE_INT offset, HOST_WIDE_INT width)
+set_usage_bits (group_info_t group, HOST_WIDE_INT offset, HOST_WIDE_INT width,
+                tree expr)
 {
   HOST_WIDE_INT i;
-
+  bool expr_escapes = can_escape (expr);
   if (offset > -MAX_OFFSET && offset + width < MAX_OFFSET)
     for (i=offset; i<offset+width; i++)
       {
        bitmap store1;
        bitmap store2;
+       bitmap escaped;
        int ai;
        if (i < 0)
          {
            store1 = group->store1_n;
            store2 = group->store2_n;
+           escaped = group->escaped_n;
            ai = -i;
          }
        else
          {
            store1 = group->store1_p;
            store2 = group->store2_p;
+           escaped = group->escaped_p;
            ai = i;
          }
@@ -924,18 +960,25 @@ static void
                  group->offset_map_size_p = ai;
              }
          }
+       if (expr_escapes)
+         bitmap_set_bit (escaped, ai);
       }
 }
 
+static void
+reset_active_stores (void)
+{
+  active_local_stores = NULL;
+  active_local_stores_len = 0;
+}
 
-/* Set the BB_INFO so that the last insn is marked as a wild read.  */
+/* Free all READ_REC of the LAST_INSN of BB_INFO.  */
 
 static void
-add_wild_read (bb_info_t bb_info)
+free_read_records (bb_info_t bb_info)
 {
   insn_info_t insn_info = bb_info->last_insn;
   read_info_t *ptr = &insn_info->read_rec;
-
   while (*ptr)
     {
       read_info_t next = (*ptr)->next;
@@ -943,16 +986,35 @@ static void
        {
          pool_free (read_info_pool, *ptr);
          *ptr = next;
-        }
+       }
      else
-        ptr = &(*ptr)->next;
+       ptr = &(*ptr)->next;
     }
+}
+
+/* Set the BB_INFO so that the last insn is marked as a wild read.  */
+
+static void
+add_wild_read (bb_info_t bb_info)
+{
+  insn_info_t insn_info = bb_info->last_insn;
   insn_info->wild_read = true;
-  active_local_stores = NULL;
-  active_local_stores_len = 0;
+  free_read_records (bb_info);
+  reset_active_stores ();
 }
 
+/* Set the BB_INFO so that the last insn is marked as a wild read of
+   non-frame locations.  */
+
+static void
+add_non_frame_wild_read (bb_info_t bb_info)
+{
+  insn_info_t insn_info = bb_info->last_insn;
+  insn_info->non_frame_wild_read = true;
+  free_read_records (bb_info);
+  reset_active_stores ();
+}
+
 /* Return true if X is a constant or one of the registers that behave
    as a constant over the life of a function.  This is equivalent to
    !rtx_varies_p for memory addresses.  */
@@ -1355,9 +1417,10 @@ record_store (rtx body, bb_info_t bb_info)
          group_info_t group
            = VEC_index (group_info_t, rtx_group_vec, group_id);
+         tree expr = MEM_EXPR (mem);
 
          store_info = (store_info_t) pool_alloc (rtx_store_info_pool);
-         set_usage_bits (group, offset, width);
+         set_usage_bits (group, offset, width, expr);
 
          if (dump_file)
            fprintf (dump_file, " processing const base store gid=%d[%d..%d)\n",
@@ -2474,8 +2537,9 @@ scan_insn (bb_info_t bb_info, rtx insn)
        }
 
      else
-       /* Every other call, including pure functions, may read memory.  */
-       add_wild_read (bb_info);
+       /* Every other call, including pure functions, may read any memory
+          that is not relative to the frame.  */
+       add_non_frame_wild_read (bb_info);
 
      return;
    }
@@ -2788,7 +2852,6 @@ dse_step2_nospill (void)
   /* Position 0 is unused because 0 is used in the maps to mean
      unused.  */
   current_position = 1;
-
   FOR_EACH_VEC_ELT (group_info_t, rtx_group_vec, i, group)
     {
       bitmap_iterator bi;
@@ -2804,12 +2867,16 @@ dse_step2_nospill (void)
       EXECUTE_IF_SET_IN_BITMAP (group->store2_n, 0, j, bi)
        {
          bitmap_set_bit (group->group_kill, current_position);
+         if (bitmap_bit_p (group->escaped_n, j))
+           bitmap_set_bit (kill_on_calls, current_position);
          group->offset_map_n[j] = current_position++;
          group->process_globally = true;
        }
       EXECUTE_IF_SET_IN_BITMAP (group->store2_p, 0, j, bi)
        {
          bitmap_set_bit (group->group_kill, current_position);
+         if (bitmap_bit_p (group->escaped_p, j))
+           bitmap_set_bit (kill_on_calls, current_position);
          group->offset_map_p[j] = current_position++;
          group->process_globally = true;
        }
@@ -3040,7 +3107,21 @@ scan_reads_nospill (insn_info_t insn_info, bitmap
              bitmap_and_compl_into (gen, group->group_kill);
            }
        }
-
+  if (insn_info->non_frame_wild_read)
+    {
+      /* Kill all non-frame related stores.  Kill all stores of variables
+         that escape.  */
+      if (kill)
+        bitmap_ior_into (kill, kill_on_calls);
+      bitmap_and_compl_into (gen, kill_on_calls);
+      FOR_EACH_VEC_ELT (group_info_t, rtx_group_vec, i, group)
+        if (group->process_globally && !group->frame_related)
+          {
+            if (kill)
+              bitmap_ior_into (kill, group->group_kill);
+            bitmap_and_compl_into (gen, group->group_kill);
+          }
+    }
   while (read_info)
     {
       FOR_EACH_VEC_ELT (group_info_t, rtx_group_vec, i, group)
@@ -3564,10 +3645,13 @@ dse_step5_nospill (void)
                fprintf (dump_file, "wild read\n");
              bitmap_clear (v);
            }
-         else if (insn_info->read_rec)
+         else if (insn_info->read_rec
+                  || insn_info->non_frame_wild_read)
            {
-             if (dump_file)
+             if (dump_file && !insn_info->non_frame_wild_read)
                fprintf (dump_file, "regular read\n");
+             else if (dump_file)
+               fprintf (dump_file, "non-frame wild read\n");
              scan_reads_nospill (insn_info, v, NULL);
            }
        }
@@ -3716,6 +3800,8 @@ dse_step7 (bool global_done)
          BITMAP_FREE (group->store1_p);
          BITMAP_FREE (group->store2_n);
          BITMAP_FREE (group->store2_p);
+         BITMAP_FREE (group->escaped_n);
+         BITMAP_FREE (group->escaped_p);
          BITMAP_FREE (group->group_kill);
        }
 
@@ -3746,6 +3832,7 @@ dse_step7 (bool global_done)
   VEC_free (group_info_t, heap, rtx_group_vec);
   BITMAP_FREE (all_blocks);
   BITMAP_FREE (scratch);
+  BITMAP_FREE (kill_on_calls);
 
   free_alloc_pool (rtx_store_info_pool);
   free_alloc_pool (read_info_pool);
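
[Editor's note: a rough sketch, not part of the patch, of the rule the change
implements; the function names below are invented, and earlier tree-level
passes may already clean up a case this simple.  A call is still assumed to
read all non-frame memory and any stack location whose address may escape,
but no longer stack locals whose address is never taken, so stores to the
latter can be proven dead across the call.]

  /* Sketch only (not from the patch): which stack stores a call is now
     assumed to read.  */
  extern void opaque_call (void);
  extern void record (int *);

  void
  example (void)
  {
    int priv[4];    /* Address never taken: may_be_aliased () is false.    */
    int shared[4];  /* Address passed to record (): may be read by calls.  */

    priv[0] = 1;    /* Never read again; with the patch opaque_call () is
                       not assumed to read it, so the store is dead.       */
    shared[0] = 2;  /* shared escapes below, so this store must be kept:
                       opaque_call () may legitimately read it.            */
    record (shared);
    opaque_call ();
  }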