[RFC] Use a local edge flag instead of a sbitmap in RPO compute
diff mbox series

Message ID nycvar.YFH.7.76.1911210922140.5566@zhemvz.fhfr.qr
State New
Headers show
Series
  • [RFC] Use a local edge flag instead of a sbitmap in RPO compute
Related show

Commit Message

Richard Biener Nov. 21, 2019, 8:33 a.m. UTC
This gets away with the top caller of sbitmap_alloc in a tramp3d
compile which is pre_and_rev_post_order_compute_fn (yeah, we're
invoking that way too often).  I've introduced those auto_*_flag
things for region based VN but I think it makes sense to use
those also for the non-region case.

Code-size of pre_and_rev_post_order_compute_fn increases with the
patch though, likely for allocation/deallocation code of the
flag which is inlined and due to the extra tail loop.  Memory
locality should improve here though (->flags is in the same
cache line as ->index which we access anyway) - but that's 
maybe wishful thinking, micro-benchmarking this seems hard and
large vs. small CFGs might tell different stories as well.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2019-11-21  Richard Biener  <rguenther@suse.de>

	* cfganal.c (pre_and_rev_post_order_compute_fn): Use an
	auto_bb_flag instead of an sbitmap for visited handling.

Comments

Jeff Law Nov. 21, 2019, 10:40 p.m. UTC | #1
On 11/21/19 1:33 AM, Richard Biener wrote:
> 
> This gets away with the top caller of sbitmap_alloc in a tramp3d
> compile which is pre_and_rev_post_order_compute_fn (yeah, we're
> invoking that way too often).  I've introduced those auto_*_flag
> things for region based VN but I think it makes sense to use
> those also for the non-region case.
> 
> Code-size of pre_and_rev_post_order_compute_fn increases with the
> patch though, likely for allocation/deallocation code of the
> flag which is inlined and due to the extra tail loop.  Memory
> locality should improve here though (->flags is in the same
> cache line as ->index which we access anyway) - but that's 
> maybe wishful thinking, micro-benchmarking this seems hard and
> large vs. small CFGs might tell different stories as well.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
> 2019-11-21  Richard Biener  <rguenther@suse.de>
> 
> 	* cfganal.c (pre_and_rev_post_order_compute_fn): Use an
> 	auto_bb_flag instead of an sbitmap for visited handling.
LGTM.
jeff

Patch
diff mbox series

Index: gcc/cfganal.c
===================================================================
--- gcc/cfganal.c	(revision 278496)
+++ gcc/cfganal.c	(working copy)
@@ -967,11 +967,8 @@  pre_and_rev_post_order_compute_fn (struc
   else
     rev_post_order_num -= NUM_FIXED_BLOCKS;
 
-  /* Allocate bitmap to track nodes that have been visited.  */
-  auto_sbitmap visited (last_basic_block_for_fn (fn));
-
-  /* None of the nodes in the CFG have been visited yet.  */
-  bitmap_clear (visited);
+  /* BB flag to track nodes that have been visited.  */
+  auto_bb_flag visited (fn);
 
   /* Push the first edge on to the stack.  */
   stack.quick_push (ei_start (ENTRY_BLOCK_PTR_FOR_FN (fn)->succs));
@@ -988,10 +985,10 @@  pre_and_rev_post_order_compute_fn (struc
 
       /* Check if the edge destination has been visited yet.  */
       if (dest != EXIT_BLOCK_PTR_FOR_FN (fn)
-	  && ! bitmap_bit_p (visited, dest->index))
+	  && ! (dest->flags & visited))
 	{
 	  /* Mark that we have visited the destination.  */
-	  bitmap_set_bit (visited, dest->index);
+	  dest->flags |= visited;
 
 	  if (pre_order)
 	    pre_order[pre_order_num] = dest->index;
@@ -1032,6 +1029,10 @@  pre_and_rev_post_order_compute_fn (struc
 	rev_post_order[rev_post_order_num--] = ENTRY_BLOCK;
     }
 
+  /* Clear the temporarily allocated flag.  */
+  for (int i = 0; i < pre_order_num; ++i)
+    BASIC_BLOCK_FOR_FN (fn, rev_post_order[i])->flags &= ~visited;
+
   return pre_order_num;
 }