From patchwork Fri Jun 14 08:49:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1115828 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-502949-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ucw.cz Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45QDnr3LdHz9s6w for ; Fri, 14 Jun 2019 18:50:06 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type :content-transfer-encoding; q=dns; s=default; b=vE/OOyl63HEs4G09 fiWGkt8ISVDvkcil2SycScT+wWS1upbopDa3RVaT/33tUH7wYhAclqNRaP50jYwb Hav65hCuL87/wW7nwIG0csj1JP3O+4DDT3IeZzxqzBXGTG3RaZ29ZC3mY//2f9xE yUaPRTndtNci4/TlJKpbx6Tn+RM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type :content-transfer-encoding; s=default; bh=P/QgTnedB32KBPWLht2Z8Z JrGWA=; b=mHXSXO2zarqjCK6tCqNz+Hlj/qs1ra93yG9RXzeR3i/NwcQqn7VN9a zL5ld4TduBnTnvhL+CGMqlnNvH7csT+x80gyt5DNmeBHShfwKf6Wp5k6BN27I5O9 m12DRFSG0PvWFFjlI+kau7Kfx27k6j0/SRA4xdtfvU0xkELzUxah8= Received: (qmail 24575 invoked by alias); 14 Jun 2019 08:49:56 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 24558 invoked by uid 89); 14 Jun 2019 08:49:56 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3 autolearn=ham version=3.3.1 spammy=wy, claiming, ry, Bases X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 14 Jun 2019 08:49:53 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id D982328251C; Fri, 14 Jun 2019 10:49:50 +0200 (CEST) Date: Fri, 14 Jun 2019 10:49:50 +0200 From: Jan Hubicka To: gcc-patches@gcc.gnu.org, rguenther@suse.de, d@dcepelik.cz Subject: Remove nonoverlapping_component_refs_of_decl_p Message-ID: <20190614084950.j4vmohkzuo3ql4aa@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline User-Agent: NeoMutt/20170113 (1.7.2) Hi, this patch removes nonoverlapping_component_refs_of_decl_p and replaces it by nonoverlapping_component_refs_p. Those functions are basically equivalent, however the first assumes that the outer type of memory access is type of decl which is not true in the gimple memory model. nonoverlapping_component_refs_p may wind up more expensive by doing quick sort. If the outer types match, we could use the same walk as in nonoverlapping_component_refs_of_decl_p (and I can easilly implement it in there saing some qsort for the indirect oracles too) but since nonoverlapping_component_refs_p already handles common case of shallow access paths and does not show in profiles I am not convinced it is worth the code duplication. I think the oracle is mostly redudnant with aliasing_component_refs - in fact I have troubles constructing testcases it would catch that the second would not except for expliting the -1 corner cases of same_types_for_tbaa, so I decided to leave this for later. The only case where this function is stronger is the case of match of types in the middle of access path. This require some non-const indexed array refs to exploit it. Perhaps we could integrate both tests and do nonoverlapping_component_refs_p only if one of many handled components walks we do earlier concludes that it has chance to suceed. Point of this patch is however to fix the code duplication and also add missing checks for view converts - I don't see how it would be safe to use the access paths here when it is not in the other two oracles. I implemented assert checking that whenever nonoverlapping_component_refs_of_decl_p so does nonoverlapping_component_refs_p and found two issues: 1) the first does not give up seeing handled component that is not COMPONENT_REF while other does. This prevents it to disambiguate for example foo.bar.array[1]; from foo.baz.array[1]; where bar and baz are fields of structure foo. This is valid 2) It gives up upon seeing bitfield while it may disambiguate later in the patch like. foo.bar.bitfield1; from foo.baz.bitfield2; Here we can not tell that bitfield1 is different from bitfield2 (at least we do not do so currently claiming that it is not different for RTL, but I think in that case we should not see bitfield in the access path) but we can still disambiguate based on bar/baz. Patch changes Alias oracle query stats: refs_may_alias_p: 3021539 disambiguations, 3321136 queries ref_maybe_used_by_call_p: 7118 disambiguations, 3047133 queries call_may_clobber_ref_p: 817 disambiguations, 817 queries aliasing_component_ref_p: 2050 disambiguations, 19908 queries TBAA oracle: 1419961 disambiguations 2916692 queries 555158 are in alias set 0 575103 queries asked about the same object 0 queries asked about the same alias set 0 access volatile 252874 are dependent in the DAG 113596 are aritificially in conflict with void * to Alias oracle query stats: refs_may_alias_p: 3021521 disambiguations, 3321121 queries ref_maybe_used_by_call_p: 7118 disambiguations, 3047115 queries call_may_clobber_ref_p: 817 disambiguations, 817 queries nonoverlapping_component_refs_p: 25 disambiguations, 83528 queries aliasing_component_refs_p: 2050 disambiguations, 19908 queries TBAA oracle: 1419961 disambiguations 2916690 queries 555158 are in alias set 0 575103 queries asked about the same object 0 queries asked about the same alias set 0 access volatile 252872 are dependent in the DAG 113596 are aritificially in conflict with void * So at least for tramp3d nonoverlapping_component_refs_p doesn't do any miracles. There are fewer disambiguations caused by the new view convert verification. It however also causes: FAIL: gcc.dg/vect/pr66142.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops in function" 1 FAIL: gcc.dg/vect/pr66142.c scan-tree-dump-times vect "vectorized 1 loops in function" 1 Test does: struct A { float x, y; }; struct B { struct A t, w; }; static inline float bar (const struct B *x) { struct A r; float b, c, d; r.x = x->t.x; r.y = x->t.y; which gets inlined to void foo (float *a, float *b, float *c) { int i; float z = 0.0f; float u = *a; #pragma omp simd for (i = 0; i < 32; i++) { float x = b[i]; float y = c[i]; struct B r; r.t.x = 1.0f; r.t.y = u; r.w.x = x; r.w.y = y; z += bar (&r); } *a = z; } SIMD code turns struct B r into array of size 32. The first difference is fre1. Here we give up on the oracle because of: @@ -488,15 +505,10 @@ D.2200[_20].t.y = u_13; D.2200[_20].w.x = x_22; D.2200[_20].w.y = y_24; - _30 = MEM [(const struct B *)&D.2200][_20].t.x; - _34 = MEM [(const struct B *)&D.2200][_20].t.y; - _35 = MEM [(const struct B *)&D.2200][_20].w.x; - _36 = _30 * _35; - _38 = y_24 * _34; - b_39 = _36 + _38; - _40 = _30 * _30; - _41 = b_39 + _40; - _42 = _34 * _34; + _38 = u_13 * y_24; + b_39 = x_22 + _38; + _41 = b_39 + 1.0e+0; + _42 = u_13 * u_13; It is truly silly to not optimize this, but there is a type mismatch between "struct B[32]" and "struct B" which we perceive as a view convert (same_type_for_tbaa_p return false which is bit ridiculout because precisely this case could be supported). (We would miss this if struct B was dynamically allocated previously too.) However in general we could run into this scenario since the type mismatch is a result of forwprop1 handling D.2200[_20].t.x = 1.0e+0; D.2200[_20].t.y = u_13; D.2200[_20].w.x = x_22; D.2200[_20].w.y = y_24; _29 = &D.2200[_20]; _30 = MEM[(const struct B *)_29].t.x; _34 = MEM[(const struct B *)_29].t.y; _35 = MEM[(const struct B *)_29].w.x; _36 = _30 * _35; _37 = MEM[(const struct B *)_29].w.y; So if there was outer struct, then the view convert would indeed happen. Any ideas how to solve this? I guess one option would be to delay such lossy forward subtitutions for one of later forwprops/PRE since for most of SSA optimizers the earlier sequence should be transparent enough and there is ahope that one can put constant in. Shall I XFAIL the testcase? Bootstrapped/regtested x86_64-linux, OK? Honza * tree-ssa-alias.c (alias_stats): Add nonoverlapping_component_refs_p_may_alias and nonoverlapping_component_refs_p_no_alias. (dump_alias_stats): Dump it. (nonoverlapping_component_refs_of_decl_p): Remove. (nonoverlapping_component_refs_p): Walk all handled components; do not give up on diambiguation on first failed match. (decl_refs_may_alias_p): Watch for view converts; use nonoverlapping_component_refs_of_decl_p. Index: tree-ssa-alias.c =================================================================== --- tree-ssa-alias.c (revision 272283) +++ tree-ssa-alias.c (working copy) @@ -100,6 +100,8 @@ static struct { unsigned HOST_WIDE_INT call_may_clobber_ref_p_no_alias; unsigned HOST_WIDE_INT aliasing_component_refs_p_may_alias; unsigned HOST_WIDE_INT aliasing_component_refs_p_no_alias; + unsigned HOST_WIDE_INT nonoverlapping_component_refs_p_may_alias; + unsigned HOST_WIDE_INT nonoverlapping_component_refs_p_no_alias; } alias_stats; void @@ -124,7 +126,13 @@ dump_alias_stats (FILE *s) alias_stats.call_may_clobber_ref_p_no_alias, alias_stats.call_may_clobber_ref_p_no_alias + alias_stats.call_may_clobber_ref_p_may_alias); - fprintf (s, " aliasing_component_ref_p: " + fprintf (s, " nonoverlapping_component_refs_p: " + HOST_WIDE_INT_PRINT_DEC" disambiguations, " + HOST_WIDE_INT_PRINT_DEC" queries\n", + alias_stats.nonoverlapping_component_refs_p_no_alias, + alias_stats.nonoverlapping_component_refs_p_no_alias + + alias_stats.nonoverlapping_component_refs_p_may_alias); + fprintf (s, " aliasing_component_refs_p: " HOST_WIDE_INT_PRINT_DEC" disambiguations, " HOST_WIDE_INT_PRINT_DEC" queries\n", alias_stats.aliasing_component_refs_p_no_alias, @@ -1029,106 +1037,6 @@ aliasing_component_refs_p (tree ref1, return false; } -/* Return true if we can determine that component references REF1 and REF2, - that are within a common DECL, cannot overlap. */ - -static bool -nonoverlapping_component_refs_of_decl_p (tree ref1, tree ref2) -{ - auto_vec component_refs1; - auto_vec component_refs2; - - /* Create the stack of handled components for REF1. */ - while (handled_component_p (ref1)) - { - component_refs1.safe_push (ref1); - ref1 = TREE_OPERAND (ref1, 0); - } - if (TREE_CODE (ref1) == MEM_REF) - { - if (!integer_zerop (TREE_OPERAND (ref1, 1))) - return false; - ref1 = TREE_OPERAND (TREE_OPERAND (ref1, 0), 0); - } - - /* Create the stack of handled components for REF2. */ - while (handled_component_p (ref2)) - { - component_refs2.safe_push (ref2); - ref2 = TREE_OPERAND (ref2, 0); - } - if (TREE_CODE (ref2) == MEM_REF) - { - if (!integer_zerop (TREE_OPERAND (ref2, 1))) - return false; - ref2 = TREE_OPERAND (TREE_OPERAND (ref2, 0), 0); - } - - /* Bases must be either same or uncomparable. */ - gcc_checking_assert (ref1 == ref2 - || (DECL_P (ref1) && DECL_P (ref2) - && compare_base_decls (ref1, ref2) != 0)); - - /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same - rank. This is sufficient because we start from the same DECL and you - cannot reference several fields at a time with COMPONENT_REFs (unlike - with ARRAY_RANGE_REFs for arrays) so you always need the same number - of them to access a sub-component, unless you're in a union, in which - case the return value will precisely be false. */ - while (true) - { - do - { - if (component_refs1.is_empty ()) - return false; - ref1 = component_refs1.pop (); - } - while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref1, 0)))); - - do - { - if (component_refs2.is_empty ()) - return false; - ref2 = component_refs2.pop (); - } - while (!RECORD_OR_UNION_TYPE_P (TREE_TYPE (TREE_OPERAND (ref2, 0)))); - - /* Beware of BIT_FIELD_REF. */ - if (TREE_CODE (ref1) != COMPONENT_REF - || TREE_CODE (ref2) != COMPONENT_REF) - return false; - - tree field1 = TREE_OPERAND (ref1, 1); - tree field2 = TREE_OPERAND (ref2, 1); - - /* ??? We cannot simply use the type of operand #0 of the refs here - as the Fortran compiler smuggles type punning into COMPONENT_REFs - for common blocks instead of using unions like everyone else. */ - tree type1 = DECL_CONTEXT (field1); - tree type2 = DECL_CONTEXT (field2); - - /* We cannot disambiguate fields in a union or qualified union. */ - if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE) - return false; - - if (field1 != field2) - { - /* A field and its representative need to be considered the - same. */ - if (DECL_BIT_FIELD_REPRESENTATIVE (field1) == field2 - || DECL_BIT_FIELD_REPRESENTATIVE (field2) == field1) - return false; - /* Different fields of the same record type cannot overlap. - ??? Bitfields can overlap at RTL level so punt on them. */ - if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2)) - return false; - return true; - } - } - - return false; -} - /* qsort compare function to sort FIELD_DECLs after their DECL_FIELD_CONTEXT TYPE_UID. */ @@ -1154,40 +1062,66 @@ nonoverlapping_component_refs_p (const_t { if (!flag_strict_aliasing || !x || !y - || TREE_CODE (x) != COMPONENT_REF - || TREE_CODE (y) != COMPONENT_REF) - return false; + || !handled_component_p (x) + || !handled_component_p (y)) + { + ++alias_stats.nonoverlapping_component_refs_p_may_alias; + return false; + } auto_vec fieldsx; - while (TREE_CODE (x) == COMPONENT_REF) + while (handled_component_p (x)) { - tree field = TREE_OPERAND (x, 1); - tree type = DECL_FIELD_CONTEXT (field); - if (TREE_CODE (type) == RECORD_TYPE) - fieldsx.safe_push (field); + if (TREE_CODE (x) == COMPONENT_REF) + { + tree field = TREE_OPERAND (x, 1); + tree type = DECL_FIELD_CONTEXT (field); + if (TREE_CODE (type) == RECORD_TYPE) + fieldsx.safe_push (field); + } x = TREE_OPERAND (x, 0); } if (fieldsx.length () == 0) - return false; + { + ++alias_stats.nonoverlapping_component_refs_p_may_alias; + return false; + } auto_vec fieldsy; - while (TREE_CODE (y) == COMPONENT_REF) + while (handled_component_p (y)) { - tree field = TREE_OPERAND (y, 1); - tree type = DECL_FIELD_CONTEXT (field); - if (TREE_CODE (type) == RECORD_TYPE) - fieldsy.safe_push (TREE_OPERAND (y, 1)); + if (TREE_CODE (y) == COMPONENT_REF) + { + tree field = TREE_OPERAND (y, 1); + tree type = DECL_FIELD_CONTEXT (field); + if (TREE_CODE (type) == RECORD_TYPE) + fieldsy.safe_push (TREE_OPERAND (y, 1)); + } y = TREE_OPERAND (y, 0); } if (fieldsy.length () == 0) - return false; + { + ++alias_stats.nonoverlapping_component_refs_p_may_alias; + return false; + } /* Most common case first. */ if (fieldsx.length () == 1 && fieldsy.length () == 1) - return ((DECL_FIELD_CONTEXT (fieldsx[0]) - == DECL_FIELD_CONTEXT (fieldsy[0])) - && fieldsx[0] != fieldsy[0] - && !(DECL_BIT_FIELD (fieldsx[0]) && DECL_BIT_FIELD (fieldsy[0]))); + { + if ((DECL_FIELD_CONTEXT (fieldsx[0]) + == DECL_FIELD_CONTEXT (fieldsy[0])) + && fieldsx[0] != fieldsy[0] + && !(DECL_BIT_FIELD (fieldsx[0]) && DECL_BIT_FIELD (fieldsy[0]))) + { + ++alias_stats.nonoverlapping_component_refs_p_no_alias; + return true; + } + else + { + ++alias_stats.nonoverlapping_component_refs_p_may_alias; + return false; + } + } if (fieldsx.length () == 2) { @@ -1215,19 +1149,26 @@ nonoverlapping_component_refs_p (const_t if (typex == typey) { /* We're left with accessing different fields of a structure, - no possible overlap. */ + no possible overlap. + If we can not disambiguate, do not give up and try to continue: + it is possible that there are multiple matching types on the + access patch and each of them may result in disambiguation. */ if (fieldx != fieldy) { /* A field and its representative need to be considered the same. */ if (DECL_BIT_FIELD_REPRESENTATIVE (fieldx) == fieldy || DECL_BIT_FIELD_REPRESENTATIVE (fieldy) == fieldx) - return false; + ; /* Different fields of the same record type cannot overlap. ??? Bitfields can overlap at RTL level so punt on them. */ - if (DECL_BIT_FIELD (fieldx) && DECL_BIT_FIELD (fieldy)) - return false; - return true; + else if (DECL_BIT_FIELD (fieldx) && DECL_BIT_FIELD (fieldy)) + ; + else + { + ++alias_stats.nonoverlapping_component_refs_p_no_alias; + return true; + } } } if (TYPE_UID (typex) < TYPE_UID (typey)) @@ -1245,10 +1186,11 @@ nonoverlapping_component_refs_p (const_t } while (1); + ++alias_stats.nonoverlapping_component_refs_p_may_alias; return false; } /* Return true if two memory references based on the variables BASE1 and BASE2 constrained to [OFFSET1, OFFSET1 + MAX_SIZE1) and [OFFSET2, OFFSET2 + MAX_SIZE2) may alias. REF1 and REF2 @@ -1271,11 +1212,41 @@ decl_refs_may_alias_p (tree ref1, tree b if (!ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2)) return false; + /* Access path oracle below needs to know both references. */ + if (!ref1 || !ref2) + return true; + + /* If the decl is accessed via a MEM_REF, reconstruct the base + we can use for TBAA. */ + tree dbase1 = ref1; + + while (handled_component_p (dbase1)) + dbase1 = TREE_OPERAND (dbase1, 0); + if (TREE_CODE (dbase1) == MEM_REF + || TREE_CODE (dbase1) == TARGET_MEM_REF) + { + tree ptrtype1 = TREE_TYPE (TREE_OPERAND (dbase1, 1)); + /* If first reference is view-converted, give up now. */ + if (same_type_for_tbaa (TREE_TYPE (dbase1), TREE_TYPE (ptrtype1)) != 1) + return true; + } + + tree dbase2 = ref2; + + while (handled_component_p (dbase2)) + dbase2 = TREE_OPERAND (dbase2, 0); + if (TREE_CODE (dbase2) == MEM_REF + || TREE_CODE (dbase2) == TARGET_MEM_REF) + { + tree ptrtype2 = TREE_TYPE (TREE_OPERAND (dbase2, 1)); + /* If second reference is view-converted, give up now. */ + if (same_type_for_tbaa (TREE_TYPE (dbase2), TREE_TYPE (ptrtype2)) != 1) + return true; + } + /* For components with variable position, the above test isn't sufficient, so we disambiguate component references manually. */ - if (ref1 && ref2 - && handled_component_p (ref1) && handled_component_p (ref2) - && nonoverlapping_component_refs_of_decl_p (ref1, ref2)) + if (nonoverlapping_component_refs_p (ref1, ref2)) return false; return true;