Patchwork [fortran,4.9] Improve efficiency of array constructor operators

login
register
mail settings
Submitter Thomas Koenig
Date March 24, 2013, 9:46 p.m.
Message ID <514F743D.6020704@netcologne.de>
Download mbox | patch
Permalink /patch/230495/
State New
Headers show

Comments

Thomas Koenig - March 24, 2013, 9:46 p.m.
Hello world,

this updated patch fixes a regression in my previous patch,
with a test case for that regression also attached.

Regression-tested.

OK for trunk?

	Thomas

2013-03-24  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/55806
         * frontend-passes.c (optimize_code):  Keep track of
         current code to make code insertion possible.
         (combine_array_constructor):  New function.
         (optimize_op):  Call it.

2013-03-24  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/55806
         * gfortran.dg/array_constructor_43.f90:  New test.
         * gfortran.dg/random_seed_3.f90:  New test.
Tobias Burnus - March 28, 2013, 6:34 p.m.
On 28.03.2013 17:11, Thomas Koenig wrote:
> Am 26.03.2013 18:17, schrieb Thomas Koenig:
>> Hi Tobias,
>>
>>> I have not yet looked at the patch, but I wonder whether that causes
>>> invalid code for
>>> if (any([a,b,c] < f()))
>>> by evaluating f() multiple times.
>>
>> This is avoided by this part of the patch: [...] which results in 
>> code like this:
>>
>> __var_1 = f ();
>> if ((a > __var_1 || b > __var_1) || c > __var_1)
>
> Are there any other concerns, or can I commit this?

Indeed, the code looks fine. OK for the trunk - see nits below. Thanks 
for the patch!


On 24.03.2013 22:46, Thomas Koenig wrote:
> +}
> +
>   /* Recursive optimization of operators.  */
Another empty line missing.

Tobias

PS: Regarding the string comparison patch: As Janne, think it should be 
fine with also supporting kind=4 (don't forget to multiply the length by 
the kind, e.g. ts.kind or more properly (but equivalently) 
gfc_character_kinds[<index>].bit_size/BITS_PER_BYTE).
As follow up, I think you could do what Ondřej suggested.
Thomas Koenig - March 28, 2013, 9:33 p.m.
Hi Tobias,

I have committed the two approved patches, with the issues that you
have noted fixed.

Thanks a lot for the reviews!

> PS: Regarding the string comparison patch: As Janne, think it should be
> fine with also supporting kind=4 (don't forget to multiply the length by
> the kind, e.g. ts.kind or more properly (but equivalently)
> gfc_character_kinds[<index>].bit_size/BITS_PER_BYTE).

I'll do that.

> As follow up, I think you could do what Ondřej suggested.

I will also take a look at that.

Regards

	Thomas
Thomas Koenig - March 29, 2013, 1:42 p.m.
I wrote:

>> PS: Regarding the string comparison patch: As Janne, think it should be
>> fine with also supporting kind=4 (don't forget to multiply the length by
>> the kind, e.g. ts.kind or more properly (but equivalently)
>> gfc_character_kinds[<index>].bit_size/BITS_PER_BYTE).
>
> I'll do that.

Oops, it turns out we can't do this, at least not on little-endian
systems.

Consider:

#include <stdio.h>
int main()
{
   unsigned int i,j;
   i = 0x102;
   j = 0x201;

   printf("%d %d\n", i>j, memcmp(&i, &j, sizeof(i)));
   return 0;
}

On a little-endian system, the output is

0 1

because i = 02 01 00 00 and j = 01 02 00 00 (in hex).

So, is the original patch OK for trunk?

	Thomas
Tobias Burnus - March 29, 2013, 2:56 p.m.
Am 29.03.2013 14:42, schrieb Thomas Koenig:
> I wrote:
>
>>> PS: Regarding the string comparison patch: As Janne, think it should be
>>> fine with also supporting kind=4 (don't forget to multiply the 
>>> length by
>>> the kind, e.g. ts.kind or more properly (but equivalently)
>>> gfc_character_kinds[<index>].bit_size/BITS_PER_BYTE).
>>
>> I'll do that.
>
> Oops, it turns out we can't do this, at least not on little-endian
> systems.

I somehow completely missed that "<" is permitted for characters.*  
Thanks for the reminder. Actually, we still can do use memcpy for "==" 
and "/=" - but admittedly not for <, <=, >= and >=.

Thus, how about using memcmp for kind=4 for == and /= only - and for 
kind=1 also for <, > etc.?


Side note: The elemental intrinsic functions lge, lgt, lle and llt work 
likewise. Thus, they could also profit from the same optimization. 
However, I do not know whether anyone one uses these. The difference 
between "<" and llt() seems to be that llt takes a default-kind (or 
F2008: also ASCII kind) and uses the ASCII collating sequence while < 
etc use the collating sequence. As our default-kind character is the 
ASCII character, the distinction does not matter.

Tobias

* For <, >= etc., see Fortran 2003, "7.2.3 Relational intrinsic 
operations", 7th paragraph; Fortran 2008, 9th paragraph of "7.1.5.5.1 
Interpretation of relational intrinsic operations"

Patch

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 196748)
+++ frontend-passes.c	(Arbeitskopie)
@@ -135,6 +135,10 @@  optimize_code (gfc_code **c, int *walk_subtrees AT
   else
     count_arglist = 0;
 
+  current_code = c;
+  inserted_block = NULL;
+  changed_statement = NULL;
+
   if (op == EXEC_ASSIGN)
     optimize_assignment (*c);
   return 0;
@@ -991,13 +995,97 @@  optimize_lexical_comparison (gfc_expr *e)
   return false;
 }
 
+/* Combine stuff like [a]>b into [a>b], for easier optimization later.  Do not
+   do CHARACTER because of possible pessimization involving character
+   lengths.  */
+
+static bool
+combine_array_constructor (gfc_expr *e)
+{
+
+  gfc_expr *op1, *op2;
+  gfc_expr *scalar;
+  gfc_expr *new_expr;
+  gfc_constructor *c, *new_c;
+  gfc_constructor_base oldbase, newbase;
+  bool scalar_first;
+
+  /* Array constructors have rank one.  */
+  if (e->rank != 1)
+    return false;
+
+  op1 = e->value.op.op1;
+  op2 = e->value.op.op2;
+
+  if (op1->expr_type == EXPR_ARRAY && op2->rank == 0)
+    scalar_first = false;
+  else if (op2->expr_type == EXPR_ARRAY && op1->rank == 0)
+    {
+      scalar_first = true;
+      op1 = e->value.op.op2;
+      op2 = e->value.op.op1;
+    }
+  else
+    return false;
+
+  if (op2->ts.type == BT_CHARACTER)
+    return false;
+
+  if (op2->expr_type == EXPR_CONSTANT)
+    scalar = gfc_copy_expr (op2);
+  else
+    scalar = create_var (gfc_copy_expr (op2));
+
+  oldbase = op1->value.constructor;
+  newbase = NULL;
+  e->expr_type = EXPR_ARRAY;
+
+  c = gfc_constructor_first (oldbase);
+
+  for (c = gfc_constructor_first (oldbase); c;
+       c = gfc_constructor_next (c))
+    {
+      new_expr = gfc_get_expr ();
+      new_expr->ts = e->ts;
+      new_expr->expr_type = EXPR_OP;
+      new_expr->rank = c->expr->rank;
+      new_expr->where = c->where;
+      new_expr->value.op.op = e->value.op.op;
+
+      if (scalar_first)
+	{
+	  new_expr->value.op.op1 = gfc_copy_expr (scalar);
+	  new_expr->value.op.op2 = gfc_copy_expr (c->expr);
+	}
+      else
+	{
+	  new_expr->value.op.op1 = gfc_copy_expr (c->expr);
+	  new_expr->value.op.op2 = gfc_copy_expr (scalar);
+	}
+
+      new_c = gfc_constructor_append_expr (&newbase, new_expr, &(e->where));
+      new_c->iterator = c->iterator;
+      c->iterator = NULL;
+    }
+
+  gfc_free_expr (op1);
+  gfc_free_expr (op2);
+
+  e->value.constructor = newbase;
+  return true;
+}
+
 /* Recursive optimization of operators.  */
 
 static bool
 optimize_op (gfc_expr *e)
 {
+  bool changed;
+
   gfc_intrinsic_op op = e->value.op.op;
 
+  changed = false;
+
   /* Only use new-style comparisons.  */
   switch(op)
     {
@@ -1037,8 +1125,16 @@  optimize_op (gfc_expr *e)
     case INTRINSIC_NE:
     case INTRINSIC_GT:
     case INTRINSIC_LT:
-      return optimize_comparison (e, op);
+      changed = optimize_comparison (e, op);
 
+      /* Fall through */
+      /* Look at array constructors.  */
+    case INTRINSIC_PLUS:
+    case INTRINSIC_MINUS:
+    case INTRINSIC_TIMES:
+    case INTRINSIC_DIVIDE:
+      return combine_array_constructor (e) || changed;
+
     default:
       break;
     }