From patchwork Mon Jan 30 17:19:14 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 721639 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3vBx2p2NfHz9t2b for ; Tue, 31 Jan 2017 04:19:30 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="YMCL9TrU"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=DPAMN/jTTvnsJuU3B lGlj43L0Nq8Pi+FsY0hrhETzioXl0glFpw4YS+KYDME2CLe+W8ghU7CvxW7zbVaB mYONHGrIOce/Oji5726tldWVKtrL6T5jkDwiaJHR8ZSKTN+i0f2l00H2oss1aoJJ 9ho0tNZudRZ1OG5KcUpSXFP7hU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=Rmobv5+faFtP08pRawo9nQY ChQ0=; b=YMCL9TrUNz9ajQlm8vnMzdFBKKhjrfrJlGyzdukOJlPzmb3iqmRFUyz QHTISf2lxwWSQq/vWS1Ens53ZXZPDjVTtinp5aZlCaQvw1sArYwSq7YR5dcD+EUH n0TJPOJwbvO5uB2VUm4EZZvMBIMg3EAZjor4gQvunOprmYLuNwws= Received: (qmail 22274 invoked by alias); 30 Jan 2017 17:19:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 22262 invoked by uid 89); 30 Jan 2017 17:19:21 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.8 required=5.0 tests=BAYES_50, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=no version=3.3.2 spammy=entering, heh, ties, sk:tree-ss X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 30 Jan 2017 17:19:18 +0000 Received: from smtp.corp.redhat.com (int-mx16.intmail.prod.int.phx2.redhat.com [10.5.11.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C7030201FB; Mon, 30 Jan 2017 17:19:18 +0000 (UTC) Received: from abulafia.quesejoda.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id E381716A7C; Mon, 30 Jan 2017 17:19:16 +0000 (UTC) Subject: Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2) To: Richard Biener References: <5e79620e-8adb-7768-a34e-32fd286995f0@redhat.com> <1e426a06-270b-07fe-0e95-ef3382b614c2@redhat.com> <0ff3a68e-3e2c-c0df-8c48-b02dac40d621@redhat.com> <39c1ceff-5584-210c-6ebc-12e8d234cffb@redhat.com> <50a9aa36-16ff-49f0-59fb-9bb08733b94a@redhat.com> Cc: gcc-patches , Jeff Law From: Aldy Hernandez Message-ID: Date: Mon, 30 Jan 2017 12:19:14 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 In-Reply-To: On 01/30/2017 10:03 AM, Richard Biener wrote: > On Fri, Jan 27, 2017 at 12:20 PM, Aldy Hernandez wrote: >> On 01/26/2017 07:29 AM, Richard Biener wrote: >>> >>> On Thu, Jan 26, 2017 at 1:04 PM, Aldy Hernandez wrote: >>>> >>>> On 01/24/2017 07:23 AM, Richard Biener wrote: >> >> >>> Your initial on-demand approach is fine to catch some of the cases but it >>> will not handle those for which we'd need post-dominance. >>> >>> I guess we can incrementally add that. >> >> >> No complaints from me. >> >> This is my initial on-demand approach, with a few fixes you've commented on >> throughout. >> >> As you can see, there is still an #if 0 wrt to using your suggested >> conservative handling of memory loads, which I'm not entirely convinced of, >> as it pessimizes gcc.dg/loop-unswitch-1.c. If you feel strongly about it, I >> can enable the code again. > > It is really required -- fortunately loop-unswitch-1.c is one of the cases where > the post-dom / always-executed bits help . The comparison is inside the > loop header and thus always executed when the loop enters, so inserrting it > on the preheader edge is fine. Left as is then. > >> Also, I enhanced gcc.dg/loop-unswitch-1.c to verify that we're actually >> unswitching something. It seems kinda silly that we have various unswitch >> tests, but we don't actually check whether we have unswitched anything. > > Heh. It probably was added for an ICE... > >> This test was the only one in the *unswitch*.c set that I saw was actually >> being unswitched. Of course, if you don't agree with my #if 0 above, I will >> remove this change to the test. >> >> Finally, are we even guaranteed to unswitch in loop-unswitch-1.c across >> architectures? If not, again, I can remove the one-liner. > > I think so. Left as well. > >> >> How does this look for trunk? > > With a unswitch-local solution I meant to not add new files but put the > defined_or_undefined class (well, or rather a single function...) into > tree-ssa-loop-unswitch.c. Done. > > @@ -138,7 +141,7 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop) > { > /* Unswitching on undefined values would introduce undefined > behavior that the original program might never exercise. */ > - if (ssa_undefined_value_p (use, true)) > + if (defined_or_undefined->is_maybe_undefined (use)) > return NULL_TREE; > def = SSA_NAME_DEF_STMT (use); > def_bb = gimple_bb (def); > > as I said, moving this (now more expensive check) after > > if (def_bb > && flow_bb_inside_loop_p (loop, def_bb)) > return NULL_TREE; > > this cheap check would be better. It should avoid 99% of all calls I bet. Done. > > You can recover the loop-unswitch-1.c case by passing down > the using stmt and checking its BB against loop_header (the only > block that we trivially know is always executed when entering the region). > Or do that check in the caller, like > > if (bb != loop->header > && ssa_undefined_value_p (use, true) / > defined_or_undefined->is_maybe_undefined (use)) Done in callee. > > + gimple *def = SSA_NAME_DEF_STMT (t); > + > + /* Check that all the PHI args are fully defined. */ > + if (gphi *phi = dyn_cast (def)) > + { > + if (virtual_operand_p (PHI_RESULT (phi))) > + continue; > > You should never run into virtual operands (you only walk SSA_OP_USEs). Done. > > You can stop walking at stmts that dominate the region header, > like with > > + gimple *def = SSA_NAME_DEF_STMT (t); > /* Uses in stmts always executed when the region header > executes are fine. */ > if (dominated_by_p (CDI_DOMINATORS, loop_header, gimple_bb (def))) > continue; Hmmmm... doing this causes the PR testcase (gcc.dg/loop-unswitch-5.c in the attached patch to FAIL). I haven't looked at it, but I seem to recall in the testcase that we could have a DEF that dominated the loop but was a mess of PHI's, some of whose args were undefined. Did you perhaps want to put that dominated_by_p call after the PHI arg checks (which seems to work)?: /* Check that all the PHI args are fully defined. */ if (gphi *phi = dyn_cast (def)) ... ... ... + /* Uses in stmts always executed when the region header executes + are fine. */ + if (dominated_by_p (CDI_DOMINATORS, loop->header, gimple_bb (def))) + continue; + /* Handle calls and memory loads conservatively. */ if (!is_gimple_assign (def) || (gimple_assign_single_p (def) Until this is clear, I've left this dominated_by_p call #if 0'ed out. > > and the bail out for PARM_DECLs is wrong: > > + /* A PARM_DECL will not have an SSA_NAME_DEF_STMT. Parameters > + get their initial value from function entry. */ > + if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL) > + return false; > > needs to be a continue; rather than a return false. Done. Preliminary test show the attached patch works. Further tests on-going. Aldy gcc/ PR tree-optimization/71691 * bitmap.h (class auto_bitmap): New. * tree-ssa-loop-unswitch.c (tree_may_unswitch_on): Call is_maybe_undefined instead of ssa_undefined_value_p. gcc/testsuite/ PR tree-optimization/71691 * gcc.dg/loop-unswitch-5.c: Test that we actually unswitch a loop. diff --git a/gcc/bitmap.h b/gcc/bitmap.h index 196738b..f158b44 100644 --- a/gcc/bitmap.h +++ b/gcc/bitmap.h @@ -802,4 +802,25 @@ bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no) bmp_iter_and_compl (&(ITER), &(BITNUM)); \ bmp_iter_next (&(ITER), &(BITNUM))) +/* A class that ties the lifetime of a bitmap to its scope. */ +class auto_bitmap +{ + public: + auto_bitmap () { bits = BITMAP_ALLOC (NULL); } + ~auto_bitmap () { BITMAP_FREE (bits); } + // Allow calling bitmap functions on our bitmap. + operator bitmap () { return bits; } + + private: + // Prevent making a copy that references our bitmap. + auto_bitmap (const auto_bitmap &); + auto_bitmap &operator = (const auto_bitmap &); +#if __cplusplus >= 201103L + auto_bitmap (auto_bitmap &&); + auto_bitmap &operator = (auto_bitmap &&); +#endif + + bitmap bits; +}; + #endif /* GCC_BITMAP_H */ diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-1.c b/gcc/testsuite/gcc.dg/loop-unswitch-1.c index 930364c..f6fc41d 100644 --- a/gcc/testsuite/gcc.dg/loop-unswitch-1.c +++ b/gcc/testsuite/gcc.dg/loop-unswitch-1.c @@ -1,6 +1,6 @@ /* For PR rtl-optimization/27735 */ /* { dg-do compile } */ -/* { dg-options "-O2 -funswitch-loops" } */ +/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details" } */ void set_color(void); void xml_colorize_line(unsigned int *p, int state) @@ -32,3 +32,5 @@ parse_tag: ; } } +/* Test that we actually unswitched something. */ +/* { dg-final { scan-tree-dump ";; Unswitching loop" "unswitch" } } */ diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-5.c b/gcc/testsuite/gcc.dg/loop-unswitch-5.c new file mode 100644 index 0000000..b41e853 --- /dev/null +++ b/gcc/testsuite/gcc.dg/loop-unswitch-5.c @@ -0,0 +1,51 @@ +/* PR middle-end/71691 */ +/* { dg-do run } */ +/* { dg-options "-fno-tree-vrp -O2 -funswitch-loops -fdump-tree-unswitch-details" } */ + +/* Note: The -fno-tree-vrp above is only there to avoid VRP papering + over the problem. */ + +char b; +short f; +unsigned e; +int g = 20; + +void +foo () +{ + int l, h; + for (l = 0; l <= 7; l++) + { + int j = 38; + if (g) + h = 0; + for (; h <= 7; h++) + { + int i, k = b % (j % 4); + g = f; + for (;;) + { + j = 6 || b; + if (e) + { + for (; j; --j) + if (k) + __builtin_printf ("%d", 9); + if (i) + __builtin_printf ("%d", j); + } + if (l) + continue; + break; + } + i = f || b; + } + } +} + +int +main () +{ + foo (); + return 0; +} diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c index 92599fb..28ae7dc 100644 --- a/gcc/tree-ssa-loop-unswitch.c +++ b/gcc/tree-ssa-loop-unswitch.c @@ -109,6 +109,84 @@ tree_ssa_unswitch_loops (void) return 0; } +/* Return TRUE if an SSA_NAME maybe undefined and is therefore + unsuitable for unswitching. STMT is the statement we are + considering for unswitching and LOOP is the loop it appears in. */ + +static bool +is_maybe_undefined (const tree name, gimple *stmt, struct loop *loop) +{ + /* The loop header is the only block we can trivially determine that + will always be executed. If the comparison is in the loop + header, we know it's OK to unswitch on it. */ + if (gimple_bb (stmt) == loop->header) + return false; + + auto_bitmap visited_ssa; + auto_vec worklist; + worklist.safe_push (name); + bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name)); + while (!worklist.is_empty ()) + { + tree t = worklist.pop (); + + /* If it's obviously undefined, avoid further computations. */ + if (ssa_undefined_value_p (t, true)) + return true; + + /* A PARM_DECL will not have an SSA_NAME_DEF_STMT. Parameters + get their initial value from function entry. */ + if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL) + continue; + + gimple *def = SSA_NAME_DEF_STMT (t); + +#if 0 + /* Uses in stmts always executed when the region header executes + are fine. */ + if (dominated_by_p (CDI_DOMINATORS, loop->header, gimple_bb (def))) + continue; +#endif + + /* Check that all the PHI args are fully defined. */ + if (gphi *phi = dyn_cast (def)) + { + for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i) + { + tree t = gimple_phi_arg_def (phi, i); + /* If an SSA has already been seen, it may be a loop, + but we can continue and ignore this use. Otherwise, + add the SSA_NAME to the queue and visit it later. */ + if (TREE_CODE (t) == SSA_NAME + && bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (t))) + worklist.safe_push (t); + } + continue; + } + + /* Handle calls and memory loads conservatively. */ + if (!is_gimple_assign (def) + || (gimple_assign_single_p (def) + && gimple_vuse (def))) + return true; + + /* Check that any SSA names used to define NAME are also fully + defined. */ + use_operand_p use_p; + ssa_op_iter iter; + FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE) + { + tree t = USE_FROM_PTR (use_p); + /* If an SSA has already been seen, it may be a loop, + but we can continue and ignore this use. Otherwise, + add the SSA_NAME to the queue and visit it later. */ + if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (t))) + worklist.safe_push (t); + } + } + return false; +} + /* Checks whether we can unswitch LOOP on condition at end of BB -- one of its basic blocks (for what it means see comments below). */ @@ -136,15 +214,15 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop) /* Condition must be invariant. */ FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE) { - /* Unswitching on undefined values would introduce undefined - behavior that the original program might never exercise. */ - if (ssa_undefined_value_p (use, true)) - return NULL_TREE; def = SSA_NAME_DEF_STMT (use); def_bb = gimple_bb (def); if (def_bb && flow_bb_inside_loop_p (loop, def_bb)) return NULL_TREE; + /* Unswitching on undefined values would introduce undefined + behavior that the original program might never exercise. */ + if (is_maybe_undefined (use, stmt, loop)) + return NULL_TREE; } cond = build2 (gimple_cond_code (stmt), boolean_type_node,