diff mbox

Fix up wi::lrshift (PR c++/69399)

Message ID 20160126212616.GK3017@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Jan. 26, 2016, 9:26 p.m. UTC
On Tue, Jan 26, 2016 at 09:45:19PM +0100, Richard Biener wrote:
> On January 26, 2016 8:00:52 PM GMT+01:00, Mike Stump <mikestump@comcast.net> wrote:
> >On Jan 26, 2016, at 10:21 AM, Jakub Jelinek <jakub@redhat.com> wrote
> >> The question is, shall we do what wi::lshift does and have the fast
> >path
> >> only for the constant shifts, as the untested patch below does, or
> >shall we
> >> check shift dynamically (thus use
> >> shift < HOST_BITS_PER_WIDE_INT
> >> instead of
> >> STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT)
> >> in the patch),
> >
> >Hum…  I think I prefer the dynamic check.  The reasoning is that when
> >we fast path, we can tolerate the conditional branch to retain the fast
> >path, as most of the time, that condition will usually be true.  If the
> >compiler had troubles with knowing the usual truth value of the
> >expression, seems like we can hint that it will be true and influence
> >the static prediction of the branch.  This permits us to fast path
> >almost all the time in the non-constant, but small enough case.  For
> >known shifts, there is no code gen difference, so it doesn’t matter.
> 
> The original reasoning was to inline only the fast path if it is known at compile time and otherwise have a call. Exactly to avoid bloating callers with inlined conditionals.

I'm now also bootstrapping/regtesting following patch, the previous one
passed bootstrap/regtest, and will do cc1plus size comparison afterwards.
That said, I have done a quick check, where I believe that unless xi and
shift are both compile time constants, for the
STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
case there should be some comparison and the lrshift_large call with
the non-STATIC_CONSTANT_P variant, but in the bootstrapped (non-LTO)
cc1plus I only see 14 calls to lrshift_large, thus I bet it will likely affect
only <= 14 places right now:

0000000000776990 <_ZL11build_new_1PP3vecIP9tree_node5va_gc8vl_embedES1_S1_S6_bi>:
  777ca4:	e8 17 28 91 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
00000000008b8bc0 <_ZL31adjust_offset_for_component_refP9tree_nodePbPl.part.91>:
  8b8cc2:	e8 f9 17 7d 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000a7b550 <_ZN2wi7rrotateIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_j>:
  a7baea:	e8 d1 e9 60 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
0000000000a7bd90 <_ZN2wi7lrotateIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_j>:
  a7c457:	e8 64 e0 60 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000a7c710 <_ZN2wi7lrshiftIP9tree_nodelEENS_12unary_traitsIT_E11result_typeERKS4_RKT0_>:
  a7c851:	e8 6a dc 60 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000a7d280 <_ZN2wi7lrshiftIPK9tree_node16generic_wide_intI16wide_int_storageEEENS_12unary_traitsIT_E11result_typeERKS8_RKT0_>:
  a7d3b9:	e8 02 d1 60 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000cc1370 <_Z15real_to_integerPK10real_valuePbi>:
  cc1752:	e8 69 8d 3c 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000cc27f0 <_Z17real_from_integerP10real_value13format_helperRK16generic_wide_intI20wide_int_ref_storageILb0EEE6signop>:
  cc2f05:	e8 b6 75 3c 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000d6c420 <_Z31simplify_const_binary_operation8rtx_code12machine_modeP7rtx_defS2_>:
  d6f5a5:	e8 16 af 31 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000d7ca40 <_ZN2wi7lrotateISt4pairIP7rtx_def12machine_modeES5_EENS_12unary_traitsIT_E11result_typeERKS7_RKT0_j>:
  d7d17a:	e8 41 d3 30 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
0000000000d7d310 <_ZN2wi7rrotateISt4pairIP7rtx_def12machine_modeES5_EENS_12unary_traitsIT_E11result_typeERKS7_RKT0_j>:
  d7d99d:	e8 1e cb 30 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000ea3c40 <_ZN2wi7lrshiftI16generic_wide_intI22fixed_wide_int_storageILi192EEES4_EENS_12unary_traitsIT_E11result_typeERKS6_RKT0_>:
  ea3ca7:	e8 14 68 1e 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
0000000000f435a0 <_ZL27copy_reference_ops_from_refP9tree_nodeP3vecI22vn_reference_op_struct7va_heap6vl_ptrE>:
  f444b2:	e8 09 60 14 00       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>
--
000000000161c840 <_ZL21restructure_referencePP9tree_nodeS1_P16generic_wide_intI22fixed_wide_int_storageILi192EEES1_.constprop.133>:
 161cb51:	e8 6a d9 a6 ff       	callq  108a4c0 <_ZN2wi13lrshift_largeEPlPKljjjj>

2016-01-26  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/69399
	* wide-int.h (wi::lrshift): For larger precisions, only
	use fast path if shift is known to be < HOST_BITS_PER_WIDE_INT.

	* gcc.dg/torture/pr69399.c: New test.



	Jakub

Comments

Mike Stump Jan. 26, 2016, 9:55 p.m. UTC | #1
On Jan 26, 2016, at 1:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> will do cc1plus size comparison afterwards.

We know the dynamic check is larger.  You can’t tell the advantage of speed from size.  Better would be to time compiling any random large translation unit.

Nice to see that only 14 calls remain, that’s way better than the 34 I thought.
Jakub Jelinek Jan. 26, 2016, 11:41 p.m. UTC | #2
On Tue, Jan 26, 2016 at 01:55:41PM -0800, Mike Stump wrote:
> On Jan 26, 2016, at 1:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > will do cc1plus size comparison afterwards.
> 
> We know the dynamic check is larger.  You can’t tell the advantage of
> speed from size.  Better would be to time compiling any random large
> translation unit.
> 
> Nice to see that only 14 calls remain, that’s way better than the 34 I
> thought.

So, it seems probably the PR65656 changes made things actually significantly
worse, while I see a (small) difference in the generated code between the two
patches if I compile say tree-ssa-ccp.c with g++ 5.x, in the bootstrapped
compiler there is no difference at all, the compilers with either patch
have identical objdump -dr cc1plus.  Already at *.gimple time all the
relevant __builtin_constant_p are resolved and it seems all to 0.

So please agree on one of the two patches (don't care which), and I'll try
to distill a testcase to look at for PR65656.

	Jakub
Richard Biener Jan. 27, 2016, 10:38 a.m. UTC | #3
On Wed, Jan 27, 2016 at 12:41 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jan 26, 2016 at 01:55:41PM -0800, Mike Stump wrote:
>> On Jan 26, 2016, at 1:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > will do cc1plus size comparison afterwards.
>>
>> We know the dynamic check is larger.  You can’t tell the advantage of
>> speed from size.  Better would be to time compiling any random large
>> translation unit.
>>
>> Nice to see that only 14 calls remain, that’s way better than the 34 I
>> thought.
>
> So, it seems probably the PR65656 changes made things actually significantly
> worse, while I see a (small) difference in the generated code between the two
> patches if I compile say tree-ssa-ccp.c with g++ 5.x, in the bootstrapped
> compiler there is no difference at all, the compilers with either patch
> have identical objdump -dr cc1plus.  Already at *.gimple time all the
> relevant __builtin_constant_p are resolved and it seems all to 0.
>
> So please agree on one of the two patches (don't care which), and I'll try
> to distill a testcase to look at for PR65656.

I don't care if the wi::lshift by LOG2_BITS_PER_UNIT done by
get_ref_base_and_extent
are compiled to as good quality as before (I suspect it doesn't matter
in this case
as the shift amount is constant, but maybe due to PR65656 the
non-STATIC_CONSTANT_P
variant is better).

Richard.

>         Jakub
Jakub Jelinek Jan. 27, 2016, 11:36 a.m. UTC | #4
On Wed, Jan 27, 2016 at 11:38:33AM +0100, Richard Biener wrote:
> On Wed, Jan 27, 2016 at 12:41 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Tue, Jan 26, 2016 at 01:55:41PM -0800, Mike Stump wrote:
> >> On Jan 26, 2016, at 1:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> >> > will do cc1plus size comparison afterwards.
> >>
> >> We know the dynamic check is larger.  You can’t tell the advantage of
> >> speed from size.  Better would be to time compiling any random large
> >> translation unit.
> >>
> >> Nice to see that only 14 calls remain, that’s way better than the 34 I
> >> thought.
> >
> > So, it seems probably the PR65656 changes made things actually significantly
> > worse, while I see a (small) difference in the generated code between the two
> > patches if I compile say tree-ssa-ccp.c with g++ 5.x, in the bootstrapped
> > compiler there is no difference at all, the compilers with either patch
> > have identical objdump -dr cc1plus.  Already at *.gimple time all the
> > relevant __builtin_constant_p are resolved and it seems all to 0.
> >
> > So please agree on one of the two patches (don't care which), and I'll try
> > to distill a testcase to look at for PR65656.
> 
> I don't care if the wi::lshift by LOG2_BITS_PER_UNIT done by
> get_ref_base_and_extent
> are compiled to as good quality as before (I suspect it doesn't matter
> in this case
> as the shift amount is constant, but maybe due to PR65656 the
> non-STATIC_CONSTANT_P
> variant is better).

Ok, I'll commit the non-STATIC_CONSTANT_P variant for now, we can revisit it
later.  I have distilled a testcase for Jason in PR65656.  It seems the
problematic STATIC_CONSTANT_P is only if it is inside of the condition of
?: expression.  So we could work around it by using something like:
-      if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
-          ? (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT - 1)
-             && xi.len == 1
-             && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
-                                              HOST_WIDE_INT_MAX >> shift))
-          : precision <= HOST_BITS_PER_WIDE_INT)
+      bool fast_path = false;
+      if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT))
+        {
+          if (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT - 1)
+              && xi.len == 1
+              && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
+                                               HOST_WIDE_INT_MAX >> shift))
+            fast_path = true;
+        }
+      else if (precision <= HOST_BITS_PER_WIDE_INT)
+        fast_path = true;
+      if (fast_path)
        {
          val[0] = xi.ulow () << shift;
          result.set_len (1);
        }
      else
        result.set_len (lshift_large (val, xi.val, xi.len,
                                      precision, shift));
and similarly in lrshift.

	Jakub
diff mbox

Patch

--- gcc/wide-int.h.jj	2016-01-04 14:55:50.000000000 +0100
+++ gcc/wide-int.h	2016-01-26 19:05:20.715269366 +0100
@@ -2909,7 +2909,9 @@  wi::lrshift (const T1 &x, const T2 &y)
 	 For variable-precision integers like wide_int, handle HWI
 	 and sub-HWI integers inline.  */
       if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
-	  ? xi.len == 1 && xi.val[0] >= 0
+	  ? (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT)
+	     && xi.len == 1
+	     && xi.val[0] >= 0)
 	  : xi.precision <= HOST_BITS_PER_WIDE_INT)
 	{
 	  val[0] = xi.to_uhwi () >> shift;
--- gcc/testsuite/gcc.dg/torture/pr69399.c.jj	2016-01-26 21:39:45.732927748 +0100
+++ gcc/testsuite/gcc.dg/torture/pr69399.c	2016-01-26 21:41:09.081753051 +0100
@@ -0,0 +1,18 @@ 
+/* { dg-do run { target int128 } } */
+
+static unsigned __attribute__((noinline, noclone))
+foo (unsigned long long u)
+{
+  unsigned __int128 v = u | 0xffffff81U;
+  v >>= 64;
+  return v;
+}
+
+int
+main ()
+{
+  unsigned x = foo (27);
+  if (x != 0)
+    __builtin_abort ();
+  return 0;
+}