diff mbox series

var-tracking: Mark as sp based more VALUEs [PR92264]

Message ID 20200320100427.GV2156@tucnak
State New
Headers show
Series var-tracking: Mark as sp based more VALUEs [PR92264] | expand

Commit Message

Li, Pan2 via Gcc-patches March 20, 2020, 10:04 a.m. UTC
Hi!

With this simple patch, on i686-linux and x86_64-linux with -m32 (no change
for -m64), the find_base_term visited_vals.length () > 100 find_base_term
statistics changed (fbt is before this patch, fbt2 with this patch):
for k in '' '1$'; do for i in 32 64; do for j in fbt fbt2; do \
echo -n "$j $i $k "; LC_ALL=C grep ^$i.*"$k" $j | wc -l; done; done; done
fbt 32  5313355
fbt2 32  4229854
fbt 64  217523
fbt2 64  217523
fbt 32 1$ 1296
fbt2 32 1$ 407
fbt 64 1$ 0
fbt2 64 1$ 0
For frame_pointer_needed functions, we need to wait until we see the
fp_setter insn in the prologue at which point we disassociate the fp based
VALUEs from sp based VALUEs, but for !frame_pointer_needed functions,
we IMHO don't need to wait anything.  For ACCUMULATE_OUTGOING_ARGS it isn't
IMHO worth doing anything, as there is a single sp adjustment and so there
is no risk of many thousands deep VALUE chains, but for
!ACCUMULATE_OUTGOING_ARGS the sp keeps changing constantly.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Perhaps even better would be if we'd be able already in cselib somehow
figure out these sp based VALUEs and be able to constantly reuse them and in
loc list for them remember (apart from the registers etc. that hold the
VALUE ATM) have just (plus:P (value:P sp0) (const_int NNN)) where sp0 would
be a VALUE of stack pointer at the start of !frame_pointer_needed functions
and for frame_pointer_needed functions say sp VALUE at the point of
fp_setter insn.  Then if we have
sp -= 4
sp -= 4
sp -= 4
sp += 8
sp -= 4
sp += 8
sp -= 4
sp -= 4
sp += 8
we wouldn't need 10 sp based VALUEs, but only one for orig sp, one for
orig_sp-4, another one for orig_sp-8, another one for orig_sp-12 and that's
it.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/92264
	* var-tracking.c (add_stores): Call cselib_set_value_sp_based even
	for sp based values in !frame_pointer_needed
	&& !ACCUMULATE_OUTGOING_ARGS functions.


	Jakub

Comments

Richard Biener March 20, 2020, 10:17 a.m. UTC | #1
On Fri, 20 Mar 2020, Jakub Jelinek wrote:

> Hi!
> 
> With this simple patch, on i686-linux and x86_64-linux with -m32 (no change
> for -m64), the find_base_term visited_vals.length () > 100 find_base_term
> statistics changed (fbt is before this patch, fbt2 with this patch):
> for k in '' '1$'; do for i in 32 64; do for j in fbt fbt2; do \
> echo -n "$j $i $k "; LC_ALL=C grep ^$i.*"$k" $j | wc -l; done; done; done
> fbt 32  5313355
> fbt2 32  4229854
> fbt 64  217523
> fbt2 64  217523
> fbt 32 1$ 1296
> fbt2 32 1$ 407
> fbt 64 1$ 0
> fbt2 64 1$ 0
> For frame_pointer_needed functions, we need to wait until we see the
> fp_setter insn in the prologue at which point we disassociate the fp based
> VALUEs from sp based VALUEs, but for !frame_pointer_needed functions,
> we IMHO don't need to wait anything.  For ACCUMULATE_OUTGOING_ARGS it isn't
> IMHO worth doing anything, as there is a single sp adjustment and so there
> is no risk of many thousands deep VALUE chains, but for
> !ACCUMULATE_OUTGOING_ARGS the sp keeps changing constantly.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Perhaps even better would be if we'd be able already in cselib somehow
> figure out these sp based VALUEs and be able to constantly reuse them and in
> loc list for them remember (apart from the registers etc. that hold the
> VALUE ATM) have just (plus:P (value:P sp0) (const_int NNN)) where sp0 would
> be a VALUE of stack pointer at the start of !frame_pointer_needed functions
> and for frame_pointer_needed functions say sp VALUE at the point of
> fp_setter insn.  Then if we have
> sp -= 4
> sp -= 4
> sp -= 4
> sp += 8
> sp -= 4
> sp += 8
> sp -= 4
> sp -= 4
> sp += 8
> we wouldn't need 10 sp based VALUEs, but only one for orig sp, one for
> orig_sp-4, another one for orig_sp-8, another one for orig_sp-12 and that's
> it.

That would indeed be great - and I'd expect value-numbering to do such,
why doesn't it work like this in cselib?

I'm not knowledgable enough to review the actual patch, in particular
whether we can rely on frame_pointer_needed or ACCUMULATE_OUTGOING_ARGS
to actually reflect what the target does.

> 2020-03-19  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/92264
> 	* var-tracking.c (add_stores): Call cselib_set_value_sp_based even
> 	for sp based values in !frame_pointer_needed
> 	&& !ACCUMULATE_OUTGOING_ARGS functions.
> 
> --- gcc/var-tracking.c.jj	2020-01-12 11:54:38.532381439 +0100
> +++ gcc/var-tracking.c	2020-03-19 15:49:19.457340470 +0100
> @@ -6112,7 +6112,8 @@ add_stores (rtx loc, const_rtx expr, voi
>      }
>  
>    if (loc == stack_pointer_rtx
> -      && maybe_ne (hard_frame_pointer_adjustment, -1)
> +      && (maybe_ne (hard_frame_pointer_adjustment, -1)
> +	  || (!frame_pointer_needed && !ACCUMULATE_OUTGOING_ARGS))
>        && preserve)
>      cselib_set_value_sp_based (v);
>  
> 
> 	Jakub
> 
>
Li, Pan2 via Gcc-patches March 25, 2020, 10:33 p.m. UTC | #2
On Fri, 2020-03-20 at 11:04 +0100, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> With this simple patch, on i686-linux and x86_64-linux with -m32 (no change
> for -m64), the find_base_term visited_vals.length () > 100 find_base_term
> statistics changed (fbt is before this patch, fbt2 with this patch):
> for k in '' '1$'; do for i in 32 64; do for j in fbt fbt2; do \
> echo -n "$j $i $k "; LC_ALL=C grep ^$i.*"$k" $j | wc -l; done; done; done
> fbt 32  5313355
> fbt2 32  4229854
> fbt 64  217523
> fbt2 64  217523
> fbt 32 1$ 1296
> fbt2 32 1$ 407
> fbt 64 1$ 0
> fbt2 64 1$ 0
> For frame_pointer_needed functions, we need to wait until we see the
> fp_setter insn in the prologue at which point we disassociate the fp based
> VALUEs from sp based VALUEs, but for !frame_pointer_needed functions,
> we IMHO don't need to wait anything.  For ACCUMULATE_OUTGOING_ARGS it isn't
> IMHO worth doing anything, as there is a single sp adjustment and so there
> is no risk of many thousands deep VALUE chains, but for
> !ACCUMULATE_OUTGOING_ARGS the sp keeps changing constantly.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Perhaps even better would be if we'd be able already in cselib somehow
> figure out these sp based VALUEs and be able to constantly reuse them and in
> loc list for them remember (apart from the registers etc. that hold the
> VALUE ATM) have just (plus:P (value:P sp0) (const_int NNN)) where sp0 would
> be a VALUE of stack pointer at the start of !frame_pointer_needed functions
> and for frame_pointer_needed functions say sp VALUE at the point of
> fp_setter insn.  Then if we have
> sp -= 4
> sp -= 4
> sp -= 4
> sp += 8
> sp -= 4
> sp += 8
> sp -= 4
> sp -= 4
> sp += 8
> we wouldn't need 10 sp based VALUEs, but only one for orig sp, one for
> orig_sp-4, another one for orig_sp-8, another one for orig_sp-12 and that's
> it.
> 
> 2020-03-19  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/92264
> 	* var-tracking.c (add_stores): Call cselib_set_value_sp_based even
> 	for sp based values in !frame_pointer_needed
> 	&& !ACCUMULATE_OUTGOING_ARGS functions.
OK.  
jeff
>
diff mbox series

Patch

--- gcc/var-tracking.c.jj	2020-01-12 11:54:38.532381439 +0100
+++ gcc/var-tracking.c	2020-03-19 15:49:19.457340470 +0100
@@ -6112,7 +6112,8 @@  add_stores (rtx loc, const_rtx expr, voi
     }
 
   if (loc == stack_pointer_rtx
-      && maybe_ne (hard_frame_pointer_adjustment, -1)
+      && (maybe_ne (hard_frame_pointer_adjustment, -1)
+	  || (!frame_pointer_needed && !ACCUMULATE_OUTGOING_ARGS))
       && preserve)
     cselib_set_value_sp_based (v);