Patchwork combine_conversions int->double->int

login
register
mail settings
Submitter Marc Glisse
Date April 25, 2012, 8:12 a.m.
Message ID <alpine.DEB.2.02.1204250932210.2541@laptop-mg.saclay.inria.fr>
Download mbox | patch
Permalink /patch/154826/
State New
Headers show

Comments

Marc Glisse - April 25, 2012, 8:12 a.m.
Hello,

a conversion like int->double->int is just the identity, as long as double 
is big enough to represent all ints exactly. The most natural way I found 
to do this optimization is the attached:

2012-04-25  Marc Glisse  <marc.glisse@inria.fr>

 	PR middle-end/27139
 	* tree-ssa-forwprop.c (combine_conversions): Handle INT->FP->INT.

Does the approach make sense? I don't know that code, and adding 
FLOAT_EXPR / FIX_TRUNC_EXPR was a bit of guesswork. The precision of 
double could be multiplied by log2(base), but not doing it is safe. If 
the approach is ok, I could extend it so int->double->long also skips the 
intermediate conversion.

Bootstrapped and regression tested on linux-x86_64.

Should I try and write a testcase for a specific target checking for 
specific asm instructions there, or is there a better way?

(note that I can't commit)
Richard Guenther - April 25, 2012, 8:59 a.m.
On Wed, Apr 25, 2012 at 10:12 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> a conversion like int->double->int is just the identity, as long as double
> is big enough to represent all ints exactly. The most natural way I found to
> do this optimization is the attached:
>
> 2012-04-25  Marc Glisse  <marc.glisse@inria.fr>
>
>        PR middle-end/27139
>        * tree-ssa-forwprop.c (combine_conversions): Handle INT->FP->INT.
>
> Does the approach make sense? I don't know that code, and adding FLOAT_EXPR
> / FIX_TRUNC_EXPR was a bit of guesswork. The precision of double could be
> multiplied by log2(base), but not doing it is safe. If the approach is ok, I
> could extend it so int->double->long also skips the intermediate conversion.
>
> Bootstrapped and regression tested on linux-x86_64.
>
> Should I try and write a testcase for a specific target checking for
> specific asm instructions there, or is there a better way?

Well, scanning the forwprop dump for example.

Btw, I think not munging this new case into the existing CONVERT_EXPR_P
code would be better - it makes the code (even) harder to understand and
I'm not convinced that adding FLOAT_EXPR/FIX_TRUNC_EXPR handling
does not wreck any assumptions in that code.

It also seems that for DECIMAL_FLOAT_TYPE_P the transform is always
valid?

Richard.

> (note that I can't commit)
>
> --
> Marc Glisse
Marc Glisse - April 25, 2012, 9:37 a.m.
First, thanks a lot for answering.

On Wed, 25 Apr 2012, Richard Guenther wrote:

> On Wed, Apr 25, 2012 at 10:12 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Hello,
>>
>> a conversion like int->double->int is just the identity, as long as double
>> is big enough to represent all ints exactly. The most natural way I found to
>> do this optimization is the attached:
>>
>> 2012-04-25  Marc Glisse  <marc.glisse@inria.fr>
>>
>>        PR middle-end/27139
>>        * tree-ssa-forwprop.c (combine_conversions): Handle INT->FP->INT.
>>
>> Does the approach make sense? I don't know that code, and adding FLOAT_EXPR
>> / FIX_TRUNC_EXPR was a bit of guesswork. The precision of double could be
>> multiplied by log2(base), but not doing it is safe. If the approach is ok, I
>> could extend it so int->double->long also skips the intermediate conversion.
>>
>> Bootstrapped and regression tested on linux-x86_64.


>> Should I try and write a testcase for a specific target checking for
>> specific asm instructions there, or is there a better way?
>
> Well, scanning the forwprop dump for example.

No idea how that works, but with some grepping and looking at other tests 
it shouln't be too hard. I'll try, thanks for the pointer.

> Btw, I think not munging this new case into the existing CONVERT_EXPR_P
> code would be better - it makes the code (even) harder to understand and
> I'm not convinced that adding FLOAT_EXPR/FIX_TRUNC_EXPR handling
> does not wreck any assumptions in that code.

Ok.

> It also seems that for DECIMAL_FLOAT_TYPE_P the transform is always
> valid?

What do you mean? I just tested and my patch does the simplification for 
int->_Decimal128->int but not int->_Decimal32->int, which makes sense to 
me as _Decimal32 can't represent exactly every int. For _Decimal64, it 
doesn't do the simplification although it could, because I didn't multiply 
by log2(base), as I said above. I also thought of combining in vrp, where 
the range information could make more simplifications valid, but that 
seemed harder.

Patch

Index: tree-ssa-forwprop.c

===================================================================
--- tree-ssa-forwprop.c	(revision 186761)

+++ tree-ssa-forwprop.c	(working copy)

@@ -2403,10 +2403,11 @@  combine_conversions (gimple_stmt_iterato

 {
   gimple stmt = gsi_stmt (*gsi);
   gimple def_stmt;
   tree op0, lhs;
   enum tree_code code = gimple_assign_rhs_code (stmt);
+  enum tree_code code2;

 
   gcc_checking_assert (CONVERT_EXPR_CODE_P (code)
 		       || code == FLOAT_EXPR
 		       || code == FIX_TRUNC_EXPR);
 
@@ -2423,11 +2424,13 @@  combine_conversions (gimple_stmt_iterato

 
   def_stmt = SSA_NAME_DEF_STMT (op0);
   if (!is_gimple_assign (def_stmt))
     return 0;
 
-  if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))

+  code2 = gimple_assign_rhs_code (def_stmt);

+  if (CONVERT_EXPR_CODE_P (code2) || code2 == FLOAT_EXPR

+      || code2 == FIX_TRUNC_EXPR)

     {
       tree defop0 = gimple_assign_rhs1 (def_stmt);
       tree type = TREE_TYPE (lhs);
       tree inside_type = TREE_TYPE (defop0);
       tree inter_type = TREE_TYPE (op0);
@@ -2453,13 +2456,16 @@  combine_conversions (gimple_stmt_iterato

       /* In addition to the cases of two conversions in a row
 	 handled below, if we are converting something to its own
 	 type via an object of identical or wider precision, neither
 	 conversion is needed.  */
       if (useless_type_conversion_p (type, inside_type)
-	  && (((inter_int || inter_ptr) && final_int)

-	      || (inter_float && final_float))

-	  && inter_prec >= final_prec)

+	  && (((((inter_int || inter_ptr) && final_int)

+	        || (inter_float && final_float))

+	       && inter_prec >= final_prec)

+	      || (inside_int && final_int && inter_float

+	          && REAL_MODE_FORMAT (TYPE_MODE (inter_type))->p

+		     >= (int) inside_prec - !inside_unsignedp)))

 	{
 	  gimple_assign_set_rhs1 (stmt, unshare_expr (defop0));
 	  gimple_assign_set_rhs_code (stmt, TREE_CODE (defop0));
 	  update_stmt (stmt);
 	  return remove_prop_source_from_use (op0) ? 2 : 1;