From patchwork Sat Aug 17 19:35:17 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 267996 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 9817C2C00D9 for ; Sun, 18 Aug 2013 05:35:30 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=qVApXEwrZGPWATzw80c7IIjtMKFtDvcU9dR0/U4PruX0EjPx9DSz6 IK1f8A2dlkDb1dKyBzvMFEKxfmkt0JJTLP7Pyve8WR2XaftzYfLucEkStwHn9KOZ tBTpst5Aa6JPzUFMyumbvP1EoYRGttqcEuOEDSf4BX2Snms7iwz5Zo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=Wb8ETLAv55XNKPZhGdK3RdHOvV0=; b=PT4NdlUwPfIuxl+D9LmN ZhOIc0crg3dgID5S3VTY3y3USRWHboXUC2lUcBFnRXteH5+Ffg+4mmIo3DM4zzpZ z9JTRo2Q+Aqn7c6Gyiq1hJExZF1DzMIxtLX6xytTb0Cj4/Z2VOlwJmehLIcxUvZo vDAxwtq6t88nR6ovNs7PFFs= Received: (qmail 8238 invoked by alias); 17 Aug 2013 19:35:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 8229 invoked by uid 89); 17 Aug 2013 19:35:23 -0000 X-Spam-SWARE-Status: No, score=-4.0 required=5.0 tests=AWL, BAYES_50, KHOP_RCVD_UNTRUST, RCVD_IN_DNSWL_LOW, RCVD_IN_HOSTKARMA_NO, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL, RP_MATCHES_RCVD autolearn=ham version=3.3.2 Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Sat, 17 Aug 2013 19:35:20 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 94ECA54395B; Sat, 17 Aug 2013 21:35:17 +0200 (CEST) Date: Sat, 17 Aug 2013 21:35:17 +0200 From: Jan Hubicka To: gcc-patches@gcc.gnu.org, marxin.liska@gmail.com, rguenther@suse.de, jason@redhat.com, tglek@mozilla.com, mjambor@suse.cz, davidxl@google.com Subject: Type inheritance graph analysis & speculative devirtualization, part 1/6 Message-ID: <20130817193517.GA4750@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-06-14) Hi, I have got the ipa-devirt implementation into shape it seems actually useful and working as expected. The main selling point of it for the moment is compile time of C++ programs (especially with LTO) and verification facilities. Currently we keep all virtual functions in the callgraph until after inlining in hope that the type based devirtualization will find use of them. This does not happen in wast majorty of cases (currently about 400 times for Firefox). Removing them early, at compile time, reduces tons of streaming and thus linktime speed and memory use. For firefox the callgraph node count goes down from over 3 million functions, to hundred of thousdands. Martin Liska did statistic for me with a recent Firefox tree (rev 201708). Without the patch the overall build time is 33m real, 180m user. LTO link time is 8m real, 26m user. Peaking over 15GB of memory use (combined GCC+linker), With the patch it goes down to 26m real, 149m user and the linking is 6m real, 24m user. Memory use is peaking at 4GB for WPA, parallel part runs to 9GB, but that can be controlled by an parallelism (plus I have independent patch to cut it to half). Memory use graphs can be seen at http://kam.mff.cuni.cz/~hubicka/build1.pdf http://kam.mff.cuni.cz/~hubicka/build2.pdf I think we thus arrived to shape LTO is practically useful for firefox. I still hope to cut WPA in approx half in 4.9 timeframe and we need to address issues with command line options merging at least. Some of the improvements are seen w/o LTO, too. We save running early opts on all those functions, but I did not made any detailed stats. Since the whole area of devirtualization in middle end seems to have very little documentation and number of latent bugs and moreover I am by no means expert on C++ type hiearchy a virtual calls, I decided to break up the change into small patches and try to write an detailed description for each, so hopefully it will be easier to eyeball the changes and hopefully identify issues + get things more sane. The plan is as follows part 1: tracking of all polymorphic calls in indirect_info (until now we track part 2: basic ipa-devirt type inheritance graph construction code, debugging dumps, function to get possible targets of given virtual function call and its use at callgraph construction time. This is the part bringing important improvements on compile time, especially with firefox LTO because we get rid of unreachable function calls comming from headers quickly without having them to bubble through early optimization, LTO streaming and merging. part 4: verifier that all devirtualization decisions are in the list of possible targets + user warning when this is not true for virtual table access folding machinery. Ignoring GCC bugs it can happen only in programs with undefined effect. Falss warnings are useful to detect changes where type inheritance graph is broken. part 3: support for One Definition Rule matching at LTO time. This needs fix to types_same_for_odr I posted to the original thread. It also adds logic to match multiple different tree types together, detect some ODR violations, eliminate external vtables when internal are around and other bit subtle details. part 4: Use of ipa-devirt in unreachable function removal (that is done at LTO time, too) part 5: Do devirtualization for types in local namespaces where there is only one candidate. part 6: Do speculative devirtualization when there is only one candidate. This will handle Firefox's addref. All these changes are conceptually very easy. I found number of weird things and little improvements in the current devirt code I will send independently. I am also trying to keep code flexible so it allows further development: such as tracking of multiple type variants with same targets thorough ipa-cp and folding and such. The dynamic type changes seems somewhat scary so I want to keep code simple at start to be sure it won't produce wrong code. (in the plan above only part 5 risks bad code). This is part 1 of the series. Nothing really fancy is going on here. It adds code to cgraph.c to notice all polymorphic calls and record the info. I also added virtual_method_call_p that disables the devirtualization code path from trying to do something on the OBJ-C OBJ_TYPE_REF use because it can not work as currently written. If we remove it, we can use test for OBJ_TYPE_REF again instead. Boostrapped/regtested x86_64-linux, testing ppc64-linux and will commit it if passes and there are no objections. Honza * cgraph.c (cgraph_create_indirect_edge): Discover polymorphic calls and record basic info into indirect_info. * gimple-fold.c (gimple_fold_call): When doing BINFO based devirtualization, ignore objc function calls. * ipa-cp.c (initialize_node_lattices): Be ready for polymorphic call with no parm index info. * ipa-prop.c (ipa_analyze_call_uses): Likewise. * tree.c (virtual_method_call_p): New function. * tree.h (virtual_method_call_p): Declare. Index: cgraph.c =================================================================== --- cgraph.c (revision 201814) +++ cgraph.c (working copy) @@ -925,6 +925,7 @@ cgraph_create_indirect_edge (struct cgra { struct cgraph_edge *edge = cgraph_create_edge_1 (caller, NULL, call_stmt, count, freq); + tree target; edge->indirect_unknown_callee = 1; initialize_inline_failed (edge); @@ -932,6 +933,23 @@ cgraph_create_indirect_edge (struct cgra edge->indirect_info = cgraph_allocate_init_indirect_info (); edge->indirect_info->ecf_flags = ecf_flags; + /* Record polymorphic call info. */ + if (call_stmt + && (target = gimple_call_fn (call_stmt)) + && virtual_method_call_p (target)) + { + tree type = obj_type_ref_class (target); + + + /* Only record types can have virtual calls. */ + gcc_assert (TREE_CODE (type) == RECORD_TYPE); + edge->indirect_info->param_index = -1; + edge->indirect_info->otr_token + = tree_low_cst (OBJ_TYPE_REF_TOKEN (target), 1); + edge->indirect_info->otr_type = type; + edge->indirect_info->polymorphic = 1; + } + edge->next_callee = caller->indirect_calls; if (caller->indirect_calls) caller->indirect_calls->prev_callee = edge; Index: gimple-fold.c =================================================================== --- gimple-fold.c (revision 201814) +++ gimple-fold.c (working copy) @@ -1105,7 +1105,7 @@ gimple_fold_call (gimple_stmt_iterator * gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); changed = true; } - else + else if (virtual_method_call_p (callee)) { tree obj = OBJ_TYPE_REF_OBJECT (callee); tree binfo = gimple_extract_devirt_binfo_from_cst Index: ipa-cp.c =================================================================== --- ipa-cp.c (revision 201814) +++ ipa-cp.c (working copy) @@ -734,7 +734,8 @@ initialize_node_lattices (struct cgraph_ } for (ie = node->indirect_calls; ie; ie = ie->next_callee) - if (ie->indirect_info->polymorphic) + if (ie->indirect_info->polymorphic + && ie->indirect_info->param_index >= 0) { gcc_checking_assert (ie->indirect_info->param_index >= 0); ipa_get_parm_lattices (info, Index: ipa-prop.c =================================================================== --- ipa-prop.c (revision 201814) +++ ipa-prop.c (working copy) @@ -1922,7 +1922,7 @@ ipa_analyze_call_uses (struct cgraph_nod return; if (TREE_CODE (target) == SSA_NAME) ipa_analyze_indirect_call_uses (node, info, parms_ainfo, call, target); - else if (TREE_CODE (target) == OBJ_TYPE_REF) + else if (virtual_method_call_p (target)) ipa_analyze_virtual_call_uses (node, info, call, target); } Index: tree.c =================================================================== --- tree.c (revision 201817) +++ tree.c (working copy) @@ -11864,6 +11864,27 @@ types_same_for_odr (tree type1, tree typ return true; } +/* TARGET is a call target of GIMPLE call statement + (obtained by gimple_call_fn). Return true if it is + OBJ_TYPE_REF representing an virtual call of C++ method. + (As opposed to OBJ_TYPE_REF representing objc calls + through a cast where middle-end devirtualization machinery + can't apply.) */ + +bool +virtual_method_call_p (tree target) +{ + if (TREE_CODE (target) != OBJ_TYPE_REF) + return false; + target = TREE_TYPE (target); + gcc_checking_assert (TREE_CODE (target) == POINTER_TYPE); + target = TREE_TYPE (target); + if (TREE_CODE (target) == FUNCTION_TYPE) + return false; + gcc_checking_assert (TREE_CODE (target) == METHOD_TYPE); + return true; +} + /* REF is OBJ_TYPE_REF, return the class the ref corresponds to. */ tree Index: tree.h =================================================================== --- tree.h (revision 201814) +++ tree.h (working copy) @@ -5974,6 +5974,7 @@ extern location_t tree_nonartificial_loc extern tree block_ultimate_origin (const_tree); extern tree get_binfo_at_offset (tree, HOST_WIDE_INT, tree); +extern bool virtual_method_call_p (tree); extern tree obj_type_ref_class (tree ref); extern bool types_same_for_odr (tree type1, tree type2); extern tree get_ref_base_and_extent (tree, HOST_WIDE_INT *,