diff mbox series

[fortran] Fix PR 82567

Message ID 359f2531-46d3-b2ce-666b-1267d28c1303@netcologne.de
State New
Headers show
Series [fortran] Fix PR 82567 | expand

Commit Message

Thomas Koenig Oct. 17, 2017, 10:36 p.m. UTC
Hello world,

this patch fixes a regression with long compile times,
which came about due to our handling of array constructors
at compile time.  This, togeteher with a simplification in
front end optimization, led to long compile times and large
code.

Regression-tested. OK for trunk and the other affected branches?

Regards

	Thomas

2917-10-17  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/82567
         * frontend-passes.c (combine_array_constructor): If an array
         constructor is all constants and has more elements than a small
         constant, don't convert a*[b,c] to [a*b,a*c] to reduce compilation
         times.

2917-10-17  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/82567
         * gfortran.dg/array_constructor_51.f90: New test.

Comments

Jerry DeLisle Oct. 18, 2017, 1:14 a.m. UTC | #1
On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> Hello world,
> 
> this patch fixes a regression with long compile times,
> which came about due to our handling of array constructors
> at compile time.  This, togeteher with a simplification in
> front end optimization, led to long compile times and large
> code.
> 
> Regression-tested. OK for trunk and the other affected branches?
> 

Well I know 42 is the answer to the ultimate question of the universe so this
must be OK.  I just don't know what the question is.

OK and thanks,

Jerry

+#define CONSTR_LEN_MAX 42
+
Steve Kargl Oct. 18, 2017, 2:05 a.m. UTC | #2
On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:
> On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> > Hello world,
> > 
> > this patch fixes a regression with long compile times,
> > which came about due to our handling of array constructors
> > at compile time.  This, togeteher with a simplification in
> > front end optimization, led to long compile times and large
> > code.
> > 
> > Regression-tested. OK for trunk and the other affected branches?
> > 
> 
> Well I know 42 is the answer to the ultimate question of the universe so this
> must be OK.  I just don't know what the question is.
> 
> OK and thanks,
> 
> Jerry
> 
> +#define CONSTR_LEN_MAX 42

Actually, I was wondering about the choice myself.  With
most common hardware having fairly robust L1 and L2 cache
sizes, a double precision array constructor with 42 
elements only occupies 336 bytes.  Seems small.
Mikael Morin Oct. 18, 2017, 11:40 a.m. UTC | #3
Le 18/10/2017 à 04:05, Steve Kargl a écrit :
> On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:
>> On 10/17/2017 03:36 PM, Thomas Koenig wrote:
>>> Hello world,
>>>
>>> this patch fixes a regression with long compile times,
>>> which came about due to our handling of array constructors
>>> at compile time.  This, togeteher with a simplification in
>>> front end optimization, led to long compile times and large
>>> code.
>>>
>>> Regression-tested. OK for trunk and the other affected branches?
>>>
>>
>> Well I know 42 is the answer to the ultimate question of the universe so this
>> must be OK.  I just don't know what the question is.
>>
>> OK and thanks,
>>
>> Jerry
>>
>> +#define CONSTR_LEN_MAX 42
> 
> Actually, I was wondering about the choice myself.  With
> most common hardware having fairly robust L1 and L2 cache
> sizes, a double precision array constructor with 42
> elements only occupies 336 bytes.  Seems small.
> 
There is a -fmax-array-constructor=n option. Can’t we use it for the limit?
Thomas Koenig Oct. 18, 2017, 9:32 p.m. UTC | #4
Hi Jerry and Steve,

>> Well I know 42 is the answer to the ultimate question of the universe so this
>> must be OK.  I just don't know what the question is.
>>
>> OK and thanks,
>>
>> Jerry
>>
>> +#define CONSTR_LEN_MAX 42
> Actually, I was wondering about the choice myself.  With
> most common hardware having fairly robust L1 and L2 cache
> sizes, a double precision array constructor with 42
> elements only occupies 336 bytes.  Seems small.

Well, the answer is that I didn't know how to chose a reasonable
constant.  I now actually ran some benchmarks using rdtsc, and
these seem to indicate that the optimum value for CONST_LEN_MAX
is actually quite short, 3 or 4, otherwise I just got a slowdown
or a break even.

So, I committed (r253872) with a length of 4 as a limit.  If anybody
comes up with a better number, we can always change this.

So, thanks for the review and the comments.

Regards

	Thomas

If somebody wants to check, here is the test case:

main.f90:

module tick
   interface
      function rdtsc()
        integer(kind=8) :: rdtsc
      end function rdtsc
   end interface
end module tick

program main
   use tick
   use tst
   implicit none
   integer(8) :: t1, t2
   t1 = rdtsc()
   call sub1(2.0)
   t2 = rdtsc()
   !  print *,"sub1 : ", t2-t1

   t1 = rdtsc()
   do i=1,10000
      call sub1(2.0)
   end do
   t2 = rdtsc()
   print *,"sub1 : ", t2-t1

   t1 = rdtsc()
   do i=1,10000
      call sub2(2.0)
   end do
   t2 = rdtsc()
   print *,"sub2 : ", t2-t1

end program main

tst.f90:
module tst
   integer, parameter :: n=4
   real, dimension(n) :: x
   real, dimension(n), parameter :: s = [(i,i=1,n)]
contains
   subroutine sub1(a)
     real, intent(in) :: a
     x(1) = a * 1.0
     x(2) = a * 2.0
     x(3) = a * 3.0
     x(4) = a * 3.0

   end subroutine sub1
   subroutine sub2(a)
     x(:) = a * s(:)
   end subroutine sub2
end module tst

rdtsc.s:
         .file   "rdtsc.s"
         .text
         .globl  rdtsc_
         .type   rdtsc_, @function
rdtsc_:
.LFB0:
         .cfi_startproc
         rdtsc
         shl     $32, %rdx
         or      %rdx, %rax
         ret
         .cfi_endproc
.LFE0:
         .size   rdtsc_, .-rdtsc_
         .section        .note.GNU-stack,"",@progbits
diff mbox series

Patch

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 253768)
+++ frontend-passes.c	(Arbeitskopie)
@@ -1635,6 +1635,8 @@  combine_array_constructor (gfc_expr *e)
   gfc_constructor *c, *new_c;
   gfc_constructor_base oldbase, newbase;
   bool scalar_first;
+  int n_elem;
+  bool all_const;
 
   /* Array constructors have rank one.  */
   if (e->rank != 1)
@@ -1674,12 +1676,38 @@  combine_array_constructor (gfc_expr *e)
   if (op2->ts.type == BT_CHARACTER)
     return false;
 
-  scalar = create_var (gfc_copy_expr (op2), "constr");
+  /* This might be an expanded constructor with very many constant values. If
+     we perform the operation here, we might end up with a long compile time,
+     so an arbitrary length bound is in order here.  If the constructor
+     constains something which is not a constant, it did not come from an
+     expansion, so leave it alone.  */
 
+#define CONSTR_LEN_MAX 42
+
   oldbase = op1->value.constructor;
+
+  n_elem = 0;
+  all_const = true;
+  for (c = gfc_constructor_first (oldbase); c; c = gfc_constructor_next(c))
+    {
+      if (c->expr->expr_type != EXPR_CONSTANT)
+	{
+	  all_const = false;
+	  break;
+	}
+      n_elem += 1;
+    }
+
+  if (all_const && n_elem > CONSTR_LEN_MAX)
+    return false;
+
+#undef CONSTR_LEN_MAX
+
   newbase = NULL;
   e->expr_type = EXPR_ARRAY;
 
+  scalar = create_var (gfc_copy_expr (op2), "constr");
+
   for (c = gfc_constructor_first (oldbase); c;
        c = gfc_constructor_next (c))
     {