diff mbox

Speedup int_bit_from_pos

Message ID 20140920160752.GA18043@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Sept. 20, 2014, 4:07 p.m. UTC
> On 09/19/14 22:04, Jan Hubicka wrote:
> >Hi,
> >int_bit_position is used by ipa-devirt's type walking code.  It is currently a bottleneck
> >since I introduced speculation into contextes (I plan to solve this by changing the
> >way i cache results). But this patch seems to make sense anyway: we do not need to go
> >through folding:
> >tree
> >bit_from_pos (tree offset, tree bitpos)
> >{
> >   if (TREE_CODE (offset) == PLUS_EXPR)
> >     offset = size_binop (PLUS_EXPR,
> >                          fold_convert (bitsizetype, TREE_OPERAND (offset, 0)),
> >                          fold_convert (bitsizetype, TREE_OPERAND (offset, 1)));
> >   else
> >     offset = fold_convert (bitsizetype, offset);
> >   return size_binop (PLUS_EXPR, bitpos,
> >                      size_binop (MULT_EXPR, offset, bitsize_unit_node));
> >}
> >
> >Because all the code cares only about constant offsets, we do not need to go through fold_convert,
> >because all the codes go via int_bit_position that already expects result to be host wide int,
> >it seems to make sense to implement quick path for that.
> >
> >Bootstrap/regtest x86_64 in progress, OK?
> >
> >Honza
> >
> >	* stor-layout.c (int_bit_from_pos): New function.
> >	* stor-layout.h (int_bit_from_pos): Declare.
> >	* tree.c (int_bit_from_pos): Use it.
> Just as a note to anyone else that peeks at this code, tree_to_shwi
> will verify the nodes are constant during a checking build.
> 
> I'd consider a different name for the function that somehow
> indicates the inputs are constants.
> 
> jeff
> 
> Please consider an assert or other checking code to ensure that
> OFFSET and BITPOS are constants.  Oh, I see that tree_to_shwi will
> get that checking when it
> 
> Jeff

Yep, tree_to_shwi will check it.  The old code did generic expression folding and
called tree_to_shwi on result, so the only difference is that old code will accept
unfolded expressions that miraculously folds into constant.  I think it is bug to
have those in IL especially on places we do not expect variable offsets.

Based on that observation, I think we can also drop handling of PLUS_EXPR in this case
as follows.

Concerning the function, it has documented in toplevel comment that parameter must
be constant or it crashes, so I think it is fine. Conerning name, I am open for renaming,
but we have those int_* variants in quite few copies, so I can do that incrementally
(see int_byte_position and related functions in stor layout).

I am testing the following simplified (and inline) variant.
Perhaps we could do similar stuff for other int_* accessors even if they do not
sit on hot paths in my test, just for the sake of code size.

Honza

Comments

Jan Hubicka Sept. 20, 2014, 6:29 p.m. UTC | #1
> 
> Yep, tree_to_shwi will check it.  The old code did generic expression folding and
> called tree_to_shwi on result, so the only difference is that old code will accept
> unfolded expressions that miraculously folds into constant.  I think it is bug to
> have those in IL especially on places we do not expect variable offsets.
> 
> Based on that observation, I think we can also drop handling of PLUS_EXPR in this case
> as follows.
> 
> Concerning the function, it has documented in toplevel comment that parameter must
> be constant or it crashes, so I think it is fine. Conerning name, I am open for renaming,
> but we have those int_* variants in quite few copies, so I can do that incrementally
> (see int_byte_position and related functions in stor layout).
> 
> I am testing the following simplified (and inline) variant.
> Perhaps we could do similar stuff for other int_* accessors even if they do not
> sit on hot paths in my test, just for the sake of code size.

Note that I benchmarked libxul build. This patch makes devirtualization to go from
70% of compile time to 30%, so it is about twice as fast.

I will care the rest by caching reorg (it went up from 4% to 70% by introducing the
speculation)

Honza
Richard Biener Sept. 21, 2014, 8:45 a.m. UTC | #2
On September 20, 2014 6:07:52 PM CEST, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On 09/19/14 22:04, Jan Hubicka wrote:
>> >Hi,
>> >int_bit_position is used by ipa-devirt's type walking code.  It is
>currently a bottleneck
>> >since I introduced speculation into contextes (I plan to solve this
>by changing the
>> >way i cache results). But this patch seems to make sense anyway: we
>do not need to go
>> >through folding:
>> >tree
>> >bit_from_pos (tree offset, tree bitpos)
>> >{
>> >   if (TREE_CODE (offset) == PLUS_EXPR)
>> >     offset = size_binop (PLUS_EXPR,
>> >                          fold_convert (bitsizetype, TREE_OPERAND
>(offset, 0)),
>> >                          fold_convert (bitsizetype, TREE_OPERAND
>(offset, 1)));
>> >   else
>> >     offset = fold_convert (bitsizetype, offset);
>> >   return size_binop (PLUS_EXPR, bitpos,
>> >                      size_binop (MULT_EXPR, offset,
>bitsize_unit_node));
>> >}
>> >
>> >Because all the code cares only about constant offsets, we do not
>need to go through fold_convert,
>> >because all the codes go via int_bit_position that already expects
>result to be host wide int,
>> >it seems to make sense to implement quick path for that.
>> >
>> >Bootstrap/regtest x86_64 in progress, OK?
>> >
>> >Honza
>> >
>> >	* stor-layout.c (int_bit_from_pos): New function.
>> >	* stor-layout.h (int_bit_from_pos): Declare.
>> >	* tree.c (int_bit_from_pos): Use it.
>> Just as a note to anyone else that peeks at this code, tree_to_shwi
>> will verify the nodes are constant during a checking build.
>> 
>> I'd consider a different name for the function that somehow
>> indicates the inputs are constants.
>> 
>> jeff
>> 
>> Please consider an assert or other checking code to ensure that
>> OFFSET and BITPOS are constants.  Oh, I see that tree_to_shwi will
>> get that checking when it
>> 
>> Jeff
>
>Yep, tree_to_shwi will check it.  The old code did generic expression
>folding and
>called tree_to_shwi on result, so the only difference is that old code
>will accept
>unfolded expressions that miraculously folds into constant.  I think it
>is bug to
>have those in IL especially on places we do not expect variable
>offsets.
>
>Based on that observation, I think we can also drop handling of
>PLUS_EXPR in this case
>as follows.
>
>Concerning the function, it has documented in toplevel comment that
>parameter must
>be constant or it crashes, so I think it is fine. Conerning name, I am
>open for renaming,
>but we have those int_* variants in quite few copies, so I can do that
>incrementally
>(see int_byte_position and related functions in stor layout).
>
>I am testing the following simplified (and inline) variant.
>Perhaps we could do similar stuff for other int_* accessors even if
>they do not
>sit on hot paths in my test, just for the sake of code size.

Please omit static from inline functions.

Also one notable difference with your patches is that the fits hwi is now not tested on the result but on the result input which, multiplied by 8, might not fit a hwi now.  So please use wide-ints here (the to_offset flavor).

Richard.

>Honza
>
>Index: tree.c
>===================================================================
>--- tree.c	(revision 215421)
>+++ tree.c	(working copy)
>@@ -2831,16 +2831,6 @@ bit_position (const_tree field)
>   return bit_from_pos (DECL_FIELD_OFFSET (field),
> 		       DECL_FIELD_BIT_OFFSET (field));
> }
>-
>-/* Likewise, but return as an integer.  It must be representable in
>-   that way (since it could be a signed value, we don't have the
>-   option of returning -1 like int_size_in_byte can.  */
>-
>-HOST_WIDE_INT
>-int_bit_position (const_tree field)
>-{
>-  return tree_to_shwi (bit_position (field));
>-}
> >
>/* Return the byte position of FIELD, in bytes from the start of the
>record.
>    This is a tree of type sizetype.  */
>Index: tree.h
>===================================================================
>--- tree.h	(revision 215421)
>+++ tree.h	(working copy)
>@@ -3877,10 +3877,20 @@ extern tree size_in_bytes (const_tree);
> extern HOST_WIDE_INT int_size_in_bytes (const_tree);
> extern HOST_WIDE_INT max_int_size_in_bytes (const_tree);
> extern tree bit_position (const_tree);
>-extern HOST_WIDE_INT int_bit_position (const_tree);
> extern tree byte_position (const_tree);
> extern HOST_WIDE_INT int_byte_position (const_tree);
> 
>+/* Like bit_position, but return as an integer.  It must be
>representable in
>+   that way (since it could be a signed value, we don't have the
>+   option of returning -1 like int_size_in_byte can.  */
>+
>+static inline HOST_WIDE_INT int_bit_position (const_tree field)
>+{ 
>+  return tree_to_shwi (DECL_FIELD_OFFSET (field)) * BITS_PER_UNIT
>+	 + tree_to_shwi (DECL_FIELD_BIT_OFFSET (field));
>+}
>+
>+
> #define sizetype sizetype_tab[(int) stk_sizetype]
> #define bitsizetype sizetype_tab[(int) stk_bitsizetype]
> #define ssizetype sizetype_tab[(int) stk_ssizetype]
Jan Hubicka Sept. 21, 2014, 2:19 p.m. UTC | #3
> 
> Please omit static from inline functions.

Yep, I suppose we want to drop static in all inlines? I can make patch for that.
> 
> Also one notable difference with your patches is that the fits hwi is now not tested on the result but on the result input which, multiplied by 8, might not fit a hwi now.  So please use wide-ints here (the to_offset flavor).

The function must always suceed (so user promise it will fit in HWI) and for
performance reasons I would rather not go into wide int by defualt, but I can
do that with checking enabled.

Honza
Richard Biener Sept. 22, 2014, 7:52 a.m. UTC | #4
On Sun, 21 Sep 2014, Jan Hubicka wrote:

> > 
> > Please omit static from inline functions.
> 
> Yep, I suppose we want to drop static in all inlines? I can make patch for that.
> > 
> > Also one notable difference with your patches is that the fits hwi is now not tested on the result but on the result input which, multiplied by 8, might not fit a hwi now.  So please use wide-ints here (the to_offset flavor).
> 
> The function must always suceed (so user promise it will fit in HWI) and for
> performance reasons I would rather not go into wide int by defualt, but I can
> do that with checking enabled.

wide-int should be fast enough, please use it.

Richard.
diff mbox

Patch

Index: tree.c
===================================================================
--- tree.c	(revision 215421)
+++ tree.c	(working copy)
@@ -2831,16 +2831,6 @@  bit_position (const_tree field)
   return bit_from_pos (DECL_FIELD_OFFSET (field),
 		       DECL_FIELD_BIT_OFFSET (field));
 }
-
-/* Likewise, but return as an integer.  It must be representable in
-   that way (since it could be a signed value, we don't have the
-   option of returning -1 like int_size_in_byte can.  */
-
-HOST_WIDE_INT
-int_bit_position (const_tree field)
-{
-  return tree_to_shwi (bit_position (field));
-}
 
 /* Return the byte position of FIELD, in bytes from the start of the record.
    This is a tree of type sizetype.  */
Index: tree.h
===================================================================
--- tree.h	(revision 215421)
+++ tree.h	(working copy)
@@ -3877,10 +3877,20 @@  extern tree size_in_bytes (const_tree);
 extern HOST_WIDE_INT int_size_in_bytes (const_tree);
 extern HOST_WIDE_INT max_int_size_in_bytes (const_tree);
 extern tree bit_position (const_tree);
-extern HOST_WIDE_INT int_bit_position (const_tree);
 extern tree byte_position (const_tree);
 extern HOST_WIDE_INT int_byte_position (const_tree);
 
+/* Like bit_position, but return as an integer.  It must be representable in
+   that way (since it could be a signed value, we don't have the
+   option of returning -1 like int_size_in_byte can.  */
+
+static inline HOST_WIDE_INT int_bit_position (const_tree field)
+{ 
+  return tree_to_shwi (DECL_FIELD_OFFSET (field)) * BITS_PER_UNIT
+	 + tree_to_shwi (DECL_FIELD_BIT_OFFSET (field));
+}
+
+
 #define sizetype sizetype_tab[(int) stk_sizetype]
 #define bitsizetype sizetype_tab[(int) stk_bitsizetype]
 #define ssizetype sizetype_tab[(int) stk_ssizetype]