[fortran] Handling of .and. and .or. expressions

Message ID e2698932-14bb-e7e8-c9dd-6b59be1e39db@netcologne.de
State New
Headers show
Series
  • [fortran] Handling of .and. and .or. expressions
Related show

Commit Message

Thomas Koenig June 11, 2018, 7:22 p.m.
Hello world,

the attached patch introduces the following changes:

If a logical .and. or .or. expression contains a reference to a function
which is impure and which also does not behave like a pure function
(i.e. does not have the implicit_pure attribute set), it emits a
warning with -Wsurprising that the function might not be evaluated.
(-Wsurprising is enabled by -Wall).

It special cases the idiom  if (associated(m) .and. m%t) which
people appear to use.

And, if there is an expression like   func() .and. flag , it
reverses the test as an optimization. The middle end should be
capable of doing this, but apparently it doesn't, so the front
end might as well do this.

What it does not do is one part of PR 57160, i.e. warn against
if (a /= 0 .and. 1/a > 5) which people who are used to C might
also like to write.

There is already quite some discussion in the PRs, especially 85599,
where not all people were of the same opinion. Let us see where the
discussion here leads us.

Regression-tested (which found one bug in the testsuite).

OK for trunk?

Regards

	Thomas

2018-06-11  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/57160
         PR fortran/85599
         * dump-parse-tree (show_attr): Add handling of implicit_pure.
         * resolve.c (impure_function_callback): New function.
         (resolve_operator): Call it vial gfc_expr_walker. Special-case
         if (associated(m) .and. m%t).  If an .and. or .or. expression
         has a function or a non-function, exchange the operands.

2018-06-11  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/57160
         PR fortran/85599
         * gfortran.dg/logical_evaluation_1.f90: New test.
         * gfortran.dg/alloc_comp_default_init_2.f90: Fix code which
         implicitly depends on short-circuiting.

Comments

Janus Weil June 13, 2018, 12:20 p.m. | #1
Hi Thomas,


>the attached patch introduces the following changes:

thanks a lot for working on this! 


>If a logical .and. or .or. expression contains a reference to a
>function
>which is impure and which also does not behave like a pure function
>(i.e. does not have the implicit_pure attribute set), it emits a
>warning with -Wsurprising that the function might not be evaluated.
>(-Wsurprising is enabled by -Wall).

I think you are warning about too many cases here. Warning about an impure function as the *second* operand of the logical operators should be enough, I think. AFAICS there is no danger of the first operand not being evaluated, right?


>It special cases the idiom  if (associated(m) .and. m%t) which
>people appear to use.

I don't fully understand why you do this, but in any case it should not be necessary if you warn about the second operand only.


>And, if there is an expression like   func() .and. flag , it
>reverses the test as an optimization.

I really don't like this part. It actually introduces more problems of the sort that we're trying to warn about ...

Cheers,
Janus



>2018-06-11  Thomas Koenig  <tkoenig@gcc.gnu.org>
>
>         PR fortran/57160
>         PR fortran/85599
>         * dump-parse-tree (show_attr): Add handling of implicit_pure.
>         * resolve.c (impure_function_callback): New function.
>         (resolve_operator): Call it vial gfc_expr_walker. Special-case
>         if (associated(m) .and. m%t).  If an .and. or .or. expression
>         has a function or a non-function, exchange the operands.
>
>2018-06-11  Thomas Koenig  <tkoenig@gcc.gnu.org>
>
>         PR fortran/57160
>         PR fortran/85599
>         * gfortran.dg/logical_evaluation_1.f90: New test.
>         * gfortran.dg/alloc_comp_default_init_2.f90: Fix code which
>         implicitly depends on short-circuiting.
Thomas Koenig June 13, 2018, 8:39 p.m. | #2
Am 13.06.2018 um 14:20 schrieb Janus Weil:
> Hi Thomas,
> 
> 
>> the attached patch introduces the following changes:
> 
> thanks a lot for working on this!
> 
> 
>> If a logical .and. or .or. expression contains a reference to a
>> function
>> which is impure and which also does not behave like a pure function
>> (i.e. does not have the implicit_pure attribute set), it emits a
>> warning with -Wsurprising that the function might not be evaluated.
>> (-Wsurprising is enabled by -Wall).
> 
> I think you are warning about too many cases here. Warning about an impure function as the *second* operand of the logical operators should be enough, I think. AFAICS there is no danger of the first operand not being evaluated, right?

Not necessarily.  The middle end, at the moment, may
lack this optimization; at least nobody has so far come
up with a test case that demonstrates otherwise.  On the
other hand, people who know the middle end extremely well have
indicated that they expect optimiazations to happen with
TRUTH_AND_EXPR, so I am far from certaint that this behavior
will continue.

> 
>> It special cases the idiom  if (associated(m) .and. m%t) which
>> people appear to use.
> 
> I don't fully understand why you do this, but in any case it should not be necessary if you warn about the second operand only.

Well, it is an easy mistake to make, and it is (as above) liable
to break at any time.

Also ASSOCIATED (and ALLOCATED) are pure, so normally I would
not warn on these.

>> And, if there is an expression like   func() .and. flag , it
>> reverses the test as an optimization.
> 
> I really don't like this part. It actually introduces more problems of the sort that we're trying to warn about ...

If we actually perform this operation, we will have warned about this
before (with -Wsurprising).

And of course, the compiler can always reverse the order...

Regards

	Thomas
Steve Kargl June 13, 2018, 8:42 p.m. | #3
On Mon, Jun 11, 2018 at 09:22:27PM +0200, Thomas Koenig wrote:
> 
> Regression-tested (which found one bug in the testsuite).
> 
> OK for trunk?
> 

I fine with the intent of the patch (see below).

PS: I think there may be some confusion on what IMPURE implies.

> Index: fortran/resolve.c
> ===================================================================
> --- fortran/resolve.c	(Revision 261388)
> +++ fortran/resolve.c	(Arbeitskopie)
> @@ -3807,7 +3807,43 @@ lookup_uop_fuzzy (const char *op, gfc_symtree *uop
>    return gfc_closest_fuzzy_match (op, candidates);
>  }
>  
> +/* Callback finding an impure function as an operand to an .and. or
> +   .or.  expression.  Remember the last function warned about to
> +   avoid double warnings when recursing.  */
>  
> +static int
> +impure_function_callback (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
> +			  void *data)
> +{
> +  gfc_expr *f = *e;
> +  const char *name;
> +  static gfc_expr *last = NULL;
> +  bool *found = (bool *) data;
> +
> +  if (f->expr_type == EXPR_FUNCTION)
> +    {
> +      *found = 1;

Why not true?  *found is declared to be bool.

> +	      if (name)
> +		gfc_warning (OPT_Wsurprising, "Impure function %qs at %L "
> +			     "might not be evaluated", name, &f->where);
> +	      else
> +		gfc_warning (OPT_Wsurprising, "Impure function at %L "
> +			     "might not be evaluated", &f->where);

I think that this can simply be "Function %qs at %L may not be evaluated"

> +	      /* Some people code which depends on the short-circuiting that
> +		 Fortran does not provide, such as

The above seems a little muddled.  I suggest a shorter comment.
"Some programmers assume that Fortran supports short-circuiting in logical
 expression, which can lead to surprising behavior.  For example, in the
 following 

> +
> +		 if (associated(m) .and. m%t) then

 m%t may be evaluated before associated(m).

> +		 So, warn about this idiom. However, avoid breaking
> +		 it on purpose.  */
> +
> +	      if (op1->expr_type == EXPR_FUNCTION && op1->value.function.isym
> +		  && op1->value.function.isym->id == GFC_ISYM_ASSOCIATED)
> +		{
> +		  gfc_expr *e = op1->value.function.actual->expr;
> +		  gfc_expr *en = op1->value.function.actual->next->expr;
> +		  if (en == NULL && gfc_check_dependency (e, op2, true))
> +		    {
> +		      gfc_warning (OPT_Wsurprising, "%qs function call at %L does "
> +				   "not guard expression at %L", "ASSOCIATED",
> +				   &op1->where, &op2->where);
> +		      dont_move = true;
> +		    }
> +		}
> +
> +	      /* A bit of optimization: Transfer if (f(x) .and. flag)
> +		 into if (flag .and. f(x)), to save evaluation of a

s/transfer/transform

I would also put quotes around the Fortran code.

> +		 function.  The middle end should be capable of doing
> +		 this with a TRUTH_AND_EXPR, but it currently does not do
> +		 so. See PR 85599.  */

The rest looks ok, but I'm not sure if this addresses Janus'
concerns.
Janne Blomqvist June 14, 2018, 8:05 a.m. | #4
On Mon, Jun 11, 2018 at 10:22 PM, Thomas Koenig <tkoenig@netcologne.de>
wrote:

> Hello world,
>
> the attached patch introduces the following changes:
>
> If a logical .and. or .or. expression contains a reference to a function
> which is impure and which also does not behave like a pure function
> (i.e. does not have the implicit_pure attribute set), it emits a
> warning with -Wsurprising that the function might not be evaluated.
> (-Wsurprising is enabled by -Wall).
>
> It special cases the idiom  if (associated(m) .and. m%t) which
> people appear to use.
>
> And, if there is an expression like   func() .and. flag , it
> reverses the test as an optimization. The middle end should be
> capable of doing this, but apparently it doesn't, so the front
> end might as well do this.
>

What about more complicated expressions like, say, "func() .and. flag .and
func2() .and. flag2" etc.? Can it move all the function calls to the end?


>
> What it does not do is one part of PR 57160, i.e. warn against
> if (a /= 0 .and. 1/a > 5) which people who are used to C might
> also like to write.
>
> There is already quite some discussion in the PRs, especially 85599,
> where not all people were of the same opinion. Let us see where the
> discussion here leads us.
>


IMHO it's a mistake that the standard refuses to specify the evaluation
strategy in the name of some rather minor hypothetical performance benefit.
Either

a) short-circuiting in left-to-right order

or

b) must evaluate all the arguments in left-to-right order

My preference would be for a) as that is what most languages are doing, but
even b) would be better than leaving it undefined.

Also, there are AFAIU other similar weirdness with impure functions. The
standard allows a compiler to optimize

y = f(x) + f(x)

into

y = 2 * f(x)

even if f is impure, which is totally bonkers. Or even not call f at all,
if the compiler determines that y is not needed.
Janus Weil June 14, 2018, 8:38 a.m. | #5
Am 14. Juni 2018 10:05:24 MESZ schrieb Janne Blomqvist <blomqvist.janne@gmail.com>:
>> There is already quite some discussion in the PRs, especially 85599,
>> where not all people were of the same opinion. Let us see where the
>> discussion here leads us.
>
>IMHO it's a mistake that the standard refuses to specify the evaluation
>strategy in the name of some rather minor hypothetical performance
>benefit.

+1

I think it's pretty much the worst strategy the standard can take, and it may even harm optimization in the end.

Optimization is always a process that involves both the programmer and the compiler. If the programmer does not know what the compiler can and will do with his code in the end, it's hard to write 'good' code (in the sense that it does what was intended, and with good performance).


>Either
>
>a) short-circuiting in left-to-right order
>
>or
>
>b) must evaluate all the arguments in left-to-right order
>
>My preference would be for a) as that is what most languages are doing,

Well, C/C++ has two different kinds of operators that give you the choice to specify whether you want short-circuiting or not.


>but
>even b) would be better than leaving it undefined.

Agreed.

In my opinion the best strategy for gfortran in the current situation would be to apply short-circuiting whenever it can be proven that it has no observable consequences. E.g. a function call should not be optimized away unless it is pure.


>Also, there are AFAIU other similar weirdness with impure functions.
>The
>standard allows a compiler to optimize
>
>y = f(x) + f(x)
>
>into
>
>y = 2 * f(x)
>
>even if f is impure, which is totally bonkers. Or even not call f at
>all,
>if the compiler determines that y is not needed.

Yes, that is the same kind of craziness. I hope gfortran does not actually do this?

Cheers,
Janus
Janne Blomqvist June 14, 2018, 9:41 a.m. | #6
On Thu, Jun 14, 2018 at 11:38 AM, Janus Weil <jaydub66@gmail.com> wrote:

>
> Am 14. Juni 2018 10:05:24 MESZ schrieb Janne Blomqvist <
> blomqvist.janne@gmail.com>:
>
> >Either
> >
> >a) short-circuiting in left-to-right order
> >
> >or
> >
> >b) must evaluate all the arguments in left-to-right order
> >
> >My preference would be for a) as that is what most languages are doing,
>
> Well, C/C++ has two different kinds of operators that give you the choice
> to specify whether you want short-circuiting or not.
>

Presumably you're thinking of "&" and "|". But to be pedantic, those are
bitwise operators, not non-short-circuiting boolean operators, although in
some cases they can be used as such. There are however situations where
they are not equivalent to hypothetical non-short-circuiting boolean
operators!

>but
> >even b) would be better than leaving it undefined.
>
> Agreed.
>
> In my opinion the best strategy for gfortran in the current situation
> would be to apply short-circuiting whenever it can be proven that it has no
> observable consequences. E.g. a function call should not be optimized away
> unless it is pure.
>

Hmm, why? I don't see why always evaluating everything is less wrong than
short-circuiting? I think the best we can do is to print a warning, and try
to pressure the committee to choose an evaluation strategy so that this
unfortunate situation could eventually be resolved (is Toon still on the
committée?).


> >Also, there are AFAIU other similar weirdness with impure functions.
> >The
> >standard allows a compiler to optimize
> >
> >y = f(x) + f(x)
> >
> >into
> >
> >y = 2 * f(x)
> >
> >even if f is impure, which is totally bonkers. Or even not call f at
> >all,
> >if the compiler determines that y is not needed.
>
> Yes, that is the same kind of craziness. I hope gfortran does not actually
> do this?
>

I would hope so too, but I haven't verified.
Janus Weil June 14, 2018, 9:44 a.m. | #7
Am 13. Juni 2018 22:39:38 MESZ schrieb Thomas Koenig <tkoenig@netcologne.de>:
>>> If a logical .and. or .or. expression contains a reference to a
>>> function
>>> which is impure and which also does not behave like a pure function
>>> (i.e. does not have the implicit_pure attribute set), it emits a
>>> warning with -Wsurprising that the function might not be evaluated.
>>> (-Wsurprising is enabled by -Wall).
>> 
>> I think you are warning about too many cases here. Warning about an
>impure function as the *second* operand of the logical operators should
>be enough, I think. AFAICS there is no danger of the first operand not
>being evaluated, right?
>
>Not necessarily.  The middle end, at the moment, may
>lack this optimization; at least nobody has so far come
>up with a test case that demonstrates otherwise.  On the
>other hand, people who know the middle end extremely well have
>indicated that they expect optimiazations to happen with
>TRUTH_AND_EXPR, so I am far from certaint that this behavior
>will continue.

IIUC, TRUTH_AND_EXPR corresponds to C's && operator, which is inherently assymetric and guaranteed to short-circuit. So I don't think the middle-end would ever switch operands here. I hope your ominous "people who know the middle end extremely well" will correct me if I'm wrong :)


>>> It special cases the idiom  if (associated(m) .and. m%t) which
>>> people appear to use.
>> 
>> I don't fully understand why you do this, but in any case it should
>not be necessary if you warn about the second operand only.
>
>Well, it is an easy mistake to make, and it is (as above) liable
>to break at any time.

You still haven't stated clearly the rationale for this warning. As I see it, Fortran does not require the compiler to do short-circuiting, but gfortran still does it. So AFAICS there is currently no reason to throw a warning on this case, right?

I don't think we need warnings for hypothetical situations. In fact, ifort might have reason to warn about this case, because it does not do short-circuiting.


>>> And, if there is an expression like   func() .and. flag , it
>>> reverses the test as an optimization.
>> 
>> I really don't like this part. It actually introduces more problems
>of the sort that we're trying to warn about ...
>
>If we actually perform this operation, we will have warned about this
>before (with -Wsurprising).

But we do not perform this operation at present, and I really don't think we should.


>And of course, the compiler can always reverse the order...

If you're talking about the middle-end here, I don't think it can (see discussion above).

Cheers,
Janus
Jakub Jelinek June 14, 2018, 9:55 a.m. | #8
On Wed, Jun 13, 2018 at 10:39:38PM +0200, Thomas Koenig wrote:
> Am 13.06.2018 um 14:20 schrieb Janus Weil:
> > Hi Thomas,
> > 
> > 
> > > the attached patch introduces the following changes:
> > 
> > thanks a lot for working on this!
> > 
> > 
> > > If a logical .and. or .or. expression contains a reference to a
> > > function
> > > which is impure and which also does not behave like a pure function
> > > (i.e. does not have the implicit_pure attribute set), it emits a
> > > warning with -Wsurprising that the function might not be evaluated.
> > > (-Wsurprising is enabled by -Wall).
> > 
> > I think you are warning about too many cases here. Warning about an impure function as the *second* operand of the logical operators should be enough, I think. AFAICS there is no danger of the first operand not being evaluated, right?
> 
> Not necessarily.  The middle end, at the moment, may
> lack this optimization; at least nobody has so far come
> up with a test case that demonstrates otherwise.  On the
> other hand, people who know the middle end extremely well have
> indicated that they expect optimiazations to happen with
> TRUTH_AND_EXPR, so I am far from certaint that this behavior
> will continue.

So what exactly is the purpose of the warning?  Teach users that
think .and./.or. in Fortran works like C/C++ &&/|| that no, that is not the
case, and that it actually works instead like &/| ?
In that case it makes sense to warn only about side-effects in the second
operand that they will be called even if the first argument is .false. .
Or warn users that there is no evaluation ordering between the first and
second operand, that both operands are evaluated and it is unspecified which
is evaluated first?  Wouldn't you then just warn all the time?  Even without
any impure procedures, you could have
l = associated (m)
if (l .and. m%t)
or similar and m%t could be evaluated before l.

I bet gfortran evaluates the side-effects left-to-right and evaluates both
arguments always, right?

	Jakub
Janus Weil June 14, 2018, 10:14 a.m. | #9
Am 14. Juni 2018 11:41:03 MESZ schrieb Janne Blomqvist <blomqvist.janne@gmail.com>:
>On Thu, Jun 14, 2018 at 11:38 AM, Janus Weil <jaydub66@gmail.com>
>wrote:
>
>>
>> Am 14. Juni 2018 10:05:24 MESZ schrieb Janne Blomqvist <
>> blomqvist.janne@gmail.com>:
>>
>> >Either
>> >
>> >a) short-circuiting in left-to-right order
>> >
>> >or
>> >
>> >b) must evaluate all the arguments in left-to-right order
>> >
>> >My preference would be for a) as that is what most languages are
>doing,
>>
>> Well, C/C++ has two different kinds of operators that give you the
>choice
>> to specify whether you want short-circuiting or not.
>
>Presumably you're thinking of "&" and "|". But to be pedantic, those
>are
>bitwise operators, not non-short-circuiting boolean operators

Yes, but they work perfectly fine as logical operators on booleans, don't they?


>There are however situations where
>they are not equivalent to hypothetical non-short-circuiting boolean
>operators!

Could you give a simple example of such a case?



>>but
>> >even b) would be better than leaving it undefined.
>>
>> Agreed.
>>
>> In my opinion the best strategy for gfortran in the current situation
>> would be to apply short-circuiting whenever it can be proven that it
>has no
>> observable consequences. E.g. a function call should not be optimized
>away
>> unless it is pure.
>
>Hmm, why? I don't see why always evaluating everything is less wrong
>than
>short-circuiting?

I'm not saying it is "less wrong", but it can be less efficient. In that sense my proposed strategy is a compromise that tries to avoid harmful optimizations but still does optimization where it is safe.

Cheers,
Janus
Janne Blomqvist June 14, 2018, 10:40 a.m. | #10
On Thu, Jun 14, 2018 at 1:14 PM, Janus Weil <janus@gcc.gnu.org> wrote:

>
>
> Am 14. Juni 2018 11:41:03 MESZ schrieb Janne Blomqvist <
> blomqvist.janne@gmail.com>:
> >On Thu, Jun 14, 2018 at 11:38 AM, Janus Weil <jaydub66@gmail.com>
> >wrote:
> >
> >>
> >> Am 14. Juni 2018 10:05:24 MESZ schrieb Janne Blomqvist <
> >> blomqvist.janne@gmail.com>:
> >>
> >> >Either
> >> >
> >> >a) short-circuiting in left-to-right order
> >> >
> >> >or
> >> >
> >> >b) must evaluate all the arguments in left-to-right order
> >> >
> >> >My preference would be for a) as that is what most languages are
> >doing,
> >>
> >> Well, C/C++ has two different kinds of operators that give you the
> >choice
> >> to specify whether you want short-circuiting or not.
> >
> >Presumably you're thinking of "&" and "|". But to be pedantic, those
> >are
> >bitwise operators, not non-short-circuiting boolean operators
>
> Yes, but they work perfectly fine as logical operators on booleans, don't
> they?
>
>
> >There are however situations where
> >they are not equivalent to hypothetical non-short-circuiting boolean
> >operators!
>
> Could you give a simple example of such a case?
>

With the usual representation of integers, for A=1, B=2, both A and B by
themselves will evaluate as true, A&&B will evaluate to true, but A&B will
evaluate to false.

>>but
> >> >even b) would be better than leaving it undefined.
> >>
> >> Agreed.
> >>
> >> In my opinion the best strategy for gfortran in the current situation
> >> would be to apply short-circuiting whenever it can be proven that it
> >has no
> >> observable consequences. E.g. a function call should not be optimized
> >away
> >> unless it is pure.
> >
> >Hmm, why? I don't see why always evaluating everything is less wrong
> >than
> >short-circuiting?
>
> I'm not saying it is "less wrong", but it can be less efficient. In that
> sense my proposed strategy is a compromise that tries to avoid harmful
> optimizations but still does optimization where it is safe.
>

If the user assumes short-circuiting, then evaluating everything is
surprising. If the user assumes everything will be evaluated,
short-circuiting is surprising. So whatever is done, somebody will think
we're doing it wrong. So in lieu of the standards committee getting their
act together on this issue, I suggest we do whatever is most efficient, and
generate a warning.
Steve Kargl June 14, 2018, 2:47 p.m. | #11
On Thu, Jun 14, 2018 at 11:55:45AM +0200, Jakub Jelinek wrote:
> 
> I bet gfortran evaluates the side-effects left-to-right and evaluates both
> arguments always, right?
> 

Currently, gfortran evaluates a logical expression left-to-right
due to its use of the middle-end's TRUTH_AND_EXPR.  The Fortran
standard does not require this left-to-right evaluation.

Janus found that with '.false. .and. check()' gfortran will not
evaluate check().  This is a valid Fortran optimization in that
'.false.' and '.false. .and. check()' are equivalent expessions
no matter what the return value of check() is.  It does not matter
if check() is PURE or IMPURE.  He then found with the operands
reversed that check() is evaluated.  He traced this behavior to 
TRUTH_AND_EXPR, where C allows short-circuiting and it performs
left-to-right evaluation.

I've already provided a sloppy patch in comment #33 of PR85599
that forces gfortran to evaluate both operands.  The patch is
equivalent to writing

tmp1 = .false.
tmp2 = check()
if (tmp1 .and. tmp2) ...

The patch is sloppy because this forced evaulation should be
enabled by -fno-short-circuit (or some such option).

Thomas' patch with a warning is simply meant to help programmers
who aren't familiar with the Fortran standard.  He's trying to
use PUREness as a way to surpress the warning.  A PURE function
by design does not have side-effects so eliminating the evaluation
of PURE function should be okay.  An IMPURE function or an unmarked
function may or may not have side effects.    

The moral of the story:  Don't use functions with side-effects.
Janus Weil June 15, 2018, 9:22 a.m. | #12
Am 14. Juni 2018 12:40:19 MESZ schrieb Janne Blomqvist <blomqvist.janne@gmail.com>:
>> >> >Either
>> >> >
>> >> >a) short-circuiting in left-to-right order
>> >> >
>> >> >or
>> >> >
>> >> >b) must evaluate all the arguments in left-to-right order
>> >> >
>> >> >My preference would be for a) as that is what most languages are
>> >doing,
>> >>
>> >> Well, C/C++ has two different kinds of operators that give you the
>> >choice
>> >> to specify whether you want short-circuiting or not.
>> >
>> >Presumably you're thinking of "&" and "|". But to be pedantic, those
>> >are
>> >bitwise operators, not non-short-circuiting boolean operators
>>
>> Yes, but they work perfectly fine as logical operators on booleans,
>don't
>> they?
>>
>>
>> >There are however situations where
>> >they are not equivalent to hypothetical non-short-circuiting boolean
>> >operators!
>>
>> Could you give a simple example of such a case?
>>
>
>With the usual representation of integers, for A=1, B=2, both A and B
>by
>themselves will evaluate as true, A&&B will evaluate to true, but A&B
>will
>evaluate to false.

Yes, sure, with integers you can play such tricks. But as long as you handle only boolean values (in Fortran: .true. and .false.), & and | are proper non-short-circuiting boolean operators.

So, to repeat my earlier statement: C/C++ gives you the choice for boolean expressions whether you want short-circuiting or not. No ambiguities, no confusion, no compiler-dependent code.

In Fortran, it still feels like functions as such are second-class citizens. People seriously advise against using them. Doesn't really help the attractivity of the language.


>>>but
>> >> >even b) would be better than leaving it undefined.
>> >>
>> >> Agreed.
>> >>
>> >> In my opinion the best strategy for gfortran in the current
>situation
>> >> would be to apply short-circuiting whenever it can be proven that
>it
>> >has no
>> >> observable consequences. E.g. a function call should not be
>optimized
>> >away
>> >> unless it is pure.
>> >
>> >Hmm, why? I don't see why always evaluating everything is less wrong
>> >than
>> >short-circuiting?
>>
>> I'm not saying it is "less wrong", but it can be less efficient. In
>that
>> sense my proposed strategy is a compromise that tries to avoid
>harmful
>> optimizations but still does optimization where it is safe.
>>
>
>If the user assumes short-circuiting, then evaluating everything is
>surprising. If the user assumes everything will be evaluated,
>short-circuiting is surprising. So whatever is done, somebody will
>think
>we're doing it wrong.

Strongly disagree here. My proposal is to apply short-circuiting only where there are absolutely no surprises. If a user writes

my_variable = .true. and my_pure_function()

then there is no way at all for him to check if the function was called or not (if the function is pure). And in fact it makes zero difference for the overall program flow and all results that are produced.

To not call the function here is an absolutely valid optimization to make, because it does not change any results, in contrast to the case of impure functions.

IMHO it is completely insane to optimize out impure functions, just for a little bit of speedup, but sacrificing compiler-independent results.

I really don't understand why I'm so alone here with this opinion.

Cheers,
Janus
Janus Weil June 15, 2018, 10:26 a.m. | #13
Am 15. Juni 2018 11:22:44 MESZ schrieb Janus Weil <janus@gcc.gnu.org>:
>>>>but
>>> >> >even b) would be better than leaving it undefined.
>>> >>
>>> >> Agreed.
>>> >>
>>> >> In my opinion the best strategy for gfortran in the current
>>situation
>>> >> would be to apply short-circuiting whenever it can be proven that
>>it
>>> >has no
>>> >> observable consequences. E.g. a function call should not be
>>optimized
>>> >away
>>> >> unless it is pure.
>>> >
>>> >Hmm, why? I don't see why always evaluating everything is less
>wrong
>>> >than
>>> >short-circuiting?
>>>
>>> I'm not saying it is "less wrong", but it can be less efficient. In
>>that
>>> sense my proposed strategy is a compromise that tries to avoid
>>harmful
>>> optimizations but still does optimization where it is safe.
>>>
>>
>>If the user assumes short-circuiting, then evaluating everything is
>>surprising. If the user assumes everything will be evaluated,
>>short-circuiting is surprising. So whatever is done, somebody will
>>think
>>we're doing it wrong.
>
>Strongly disagree here. My proposal is to apply short-circuiting only
>where there are absolutely no surprises. If a user writes
>
>my_variable = .true. and my_pure_function()

Sorry, of course this was meant to read:

my_variable = .false. .and. my_pure_function()

I'm currently travelling with a 5-inch screen only, which somewhat limits my patch-review capabilities ...

Cheers,
Janus


>then there is no way at all for him to check if the function was called
>or not (if the function is pure). And in fact it makes zero difference
>for the overall program flow and all results that are produced.
>
>To not call the function here is an absolutely valid optimization to
>make, because it does not change any results, in contrast to the case
>of impure functions.
>
>IMHO it is completely insane to optimize out impure functions, just for
>a little bit of speedup, but sacrificing compiler-independent results.
>
>I really don't understand why I'm so alone here with this opinion.
>
>Cheers,
>Janus
Thomas Koenig June 15, 2018, 5:06 p.m. | #14
Hi Janne,

> What about more complicated expressions like, say, "func() .and. flag .and
> func2() .and. flag2" etc.? Can it move all the function calls to the end?

Not at the moment.

Like many of the front-end optimizations, this aims for the easy 80%
which are relatively easy to achieve.

> IMHO it's a mistake that the standard refuses to specify the evaluation
> strategy in the name of some rather minor hypothetical performance benefit.

Well, it is the way it is, and it is not going to change. We might
as well make the best of the situation as it is.

There are other standards, such as the C standard, whose type-based
aliasing rules can be used for optimization, but can also introduce
hard-to-find bugs. gcc's philosophy is to use the optimization
possibilities, and I see no reason for gfortran to deviate from that.

Regards

	Thomas
Thomas Koenig June 15, 2018, 5:10 p.m. | #15
Am 14.06.2018 um 10:38 schrieb Janus Weil:
>> Also, there are AFAIU other similar weirdness with impure functions.
>> The
>> standard allows a compiler to optimize
>>
>> y = f(x) + f(x)
>>
>> into
>>
>> y = 2 * f(x)
>>
>> even if f is impure, which is totally bonkers. Or even not call f at
>> all,
>> if the compiler determines that y is not needed.
> Yes, that is the same kind of craziness. I hope gfortran does not actually do this?

I would vote for this, but currently it is not done unless
-faggressive-function-elimination is specified.

By the way, there is a bit of strangeness about this. People use -Ofast
knowing it will break all sorts of standards for numercial computation,
but they balk at using an optimization that is explicitly permitted
by the standard. Oh well...
Thomas Koenig June 15, 2018, 5:13 p.m. | #16
Am 14.06.2018 um 11:55 schrieb Jakub Jelinek:
> Or warn users that there is no evaluation ordering between the first and
> second operand, that both operands are evaluated and it is unspecified which
> is evaluated first?  Wouldn't you then just warn all the time?  Even without
> any impure procedures, you could have
> l = associated (m)
> if (l .and. m%t)
> or similar and m%t could be evaluated before l.

It is not the aim of this patch to catch any and all dubious programming
practices. It is the aim of this patch to flag a few well-known errors
that compiler writers use.

> I bet gfortran evaluates the side-effects left-to-right and evaluates both
> arguments always, right?

gfortran hands a TRUTH_AND_EXPR to the middle end. You know better what
this means than I do.
Janne Blomqvist June 15, 2018, 5:27 p.m. | #17
On Fri, Jun 15, 2018 at 8:06 PM, Thomas Koenig <tkoenig@netcologne.de>
wrote:

> Hi Janne,
>
> What about more complicated expressions like, say, "func() .and. flag .and
>> func2() .and. flag2" etc.? Can it move all the function calls to the end?
>>
>
> Not at the moment.
>
> Like many of the front-end optimizations, this aims for the easy 80%
> which are relatively easy to achieve.
>

Come to think about it, is this optimization also done for impure
functions? E.g. if func() changes the value of flag, then exchanging the
order they are evaluated might change the result. Of course, this is all
the fault of the programmer, but still..

But at least for pure functions, this optimization looks Ok.


IMHO it's a mistake that the standard refuses to specify the evaluation
>> strategy in the name of some rather minor hypothetical performance
>> benefit.
>>
>
> Well, it is the way it is, and it is not going to change. We might
> as well make the best of the situation as it is.
>
> There are other standards, such as the C standard, whose type-based
> aliasing rules can be used for optimization, but can also introduce
> hard-to-find bugs. gcc's philosophy is to use the optimization
> possibilities, and I see no reason for gfortran to deviate from that.
>

I agree, though I hope that the Fortran standard will eventually change to
fix this (and other craziness wrt. impure functions as well). But for the
time being, I agree we can do this.
Janne Blomqvist June 15, 2018, 5:49 p.m. | #18
On Fri, Jun 15, 2018 at 12:22 PM, Janus Weil <janus@gcc.gnu.org> wrote:

> Am 14. Juni 2018 12:40:19 MESZ schrieb Janne Blomqvist <
> blomqvist.janne@gmail.com>:
> In Fortran, it still feels like functions as such are second-class
> citizens. People seriously advise against using them. Doesn't really help
> the attractivity of the language.


Yes. Back when I followed c.l.f, several experts did advise people to never
use functions unless they were pure (or more or less effectively so, if
they didn't fulfill the standard requirements for purity). Considering that
at least some of those same experts were also part of the Fortran standards
committee, I just find it very strange that, to the best of my knowledge,
no effort to fix this has been done.

IMHO it is completely insane to optimize out impure functions, just for a
> little bit of speedup, but sacrificing compiler-independent results.
>
> I really don't understand why I'm so alone here with this opinion.
>

I would agree with you if there were some substantial majority opinion
among Fortran programmers that all the parts of a logical expression are
always evaluated, contrary to what the standard actually guarantees. But as
we have seen e.g. in the PR's that this patch attempt to fix, some people
actually seem to assume short-circuiting, e.g. by writing code like

a /= 0 .and. b/a > c

or

associated(t) .and. func(t)
Steve Kargl June 15, 2018, 6:38 p.m. | #19
On Fri, Jun 15, 2018 at 08:27:49PM +0300, Janne Blomqvist wrote:
> On Fri, Jun 15, 2018 at 8:06 PM, Thomas Koenig <tkoenig@netcologne.de>
> wrote:
> 
> >
> > What about more complicated expressions like, say, "func() .and. flag .and
> >> func2() .and. flag2" etc.? Can it move all the function calls to the end?
> >>
> >
> > Not at the moment.
> >
> > Like many of the front-end optimizations, this aims for the easy 80%
> > which are relatively easy to achieve.
> >
> 
> Come to think about it, is this optimization also done for impure
> functions? E.g. if func() changes the value of flag, then exchanging the
> order they are evaluated might change the result. Of course, this is all
> the fault of the programmer, but still..

It doesn't matter.  The code is invalid.  A compiler can do anything.

  F2018: 10.1.4 Evaluation of operations

  An intrinsic operation requires the values of its operands.

  Execution of a function reference in the logical expression in
  an IF statement (11.1.8.4), the mask expression in a WHERE
  statement (10.2.3.1), or the concurrent-limits and concurrent-steps
  in a FORALL statement (10.2.4) is permitted to define variables in
  the subsidiary action-stmt, where-assignment-stmt, or
  forall-assignment-stmt respectively.  Except in those cases:

   · the evaluation of a function reference shall neither affect
     nor be affected by the evaluation of any other entity within
     the statement;

   · if a function reference causes definition or undefinition of
     an actual argument of the function, that argument or any
     associated entities shall not appear elsewhere in the same
     statement.
> 
> But at least for pure functions, this optimization looks Ok.
> 

Why is everyone fixated on PURE vs IMPURE functions?  The Fortran
standard allows short circuiting regardless of the pureness of
a function.

  F2018: 10.1.5.4.2 Evaluation of logical intrinsic operations

  Once the interpretation of a logical intrinsic operation is
  established, the processor may evaluate any other expression
  that is logically equivalent, provided that the integrity of
  parentheses in any expression is not violated.

  Two expressions of type logical are logically equivalent if
  their values are equal for all possible values of their
  primaries.

'.false. .and. func()' and 'func() .and. .false.' are logically
equivalent to '.false.'

  F2018: 10.1.7 Evaluation of operands

  It is not necessary for a processor to evaluate all of ther
  operands of an expression, or to evaluate entirely each
  operand, if the value of the expression can be determined
  otherwise.

In fact, F2018 NOTE 10.28 describes this exact situation.

  NOTE 10.28
  This principle is most often applicable to logical expressions,
  zero-sized arrays, and zero-length strings, but it applies to
  all expressions.

  For example, in evaluating the expression

      X > Y .OR. L(Z)

  where X, Y, and Z are real and L is a function of type logical,
  the function reference L(Z) need not be evaluated if X is greater
  than Y.

There is no statement about L(Z) being PURE or not.

Thomas' patch is simply trying to alert users to the fact that
their code may not do what they think.  His check for pureness
versus impureness is an attempt to reduce warnings where the
the non-evaluation of pure function cannot change the outcome
of program.

Finally, I think that he can change the warning from 

+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	      || !gfc_implicit_pure (f->symtree->n.sym))
+	    {
+	      if (name)
+		gfc_warning (OPT_Wsurprising, "Impure function %qs at %L "
+			     "might not be evaluated", name, &f->where);
+	      else
+		gfc_warning (OPT_Wsurprising, "Impure function at %L "
+			     "might not be evaluated", &f->where);
+	    }

to 

+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	      || !gfc_implicit_pure (f->symtree->n.sym))
+	    {
+	      if (name)
+		gfc_warning (OPT_Wsurprising, "Function %qs at %L "
+			     "might not be evaluated", name, &f->where);
+	      else
+		gfc_warning (OPT_Wsurprising, "Function at %L "
+			     "might not be evaluated", &f->where);
+	    }
 
That is, remove the word 'Impure' as it can be misleading.
Janus Weil June 16, 2018, 10:35 a.m. | #20
Am 15. Juni 2018 19:10:01 MESZ schrieb Thomas Koenig <tkoenig@netcologne.de>:
>Am 14.06.2018 um 10:38 schrieb Janus Weil:
>>> Also, there are AFAIU other similar weirdness with impure functions.
>>> The
>>> standard allows a compiler to optimize
>>>
>>> y = f(x) + f(x)
>>>
>>> into
>>>
>>> y = 2 * f(x)
>>>
>>> even if f is impure, which is totally bonkers. Or even not call f at
>>> all,
>>> if the compiler determines that y is not needed.
>> Yes, that is the same kind of craziness. I hope gfortran does not
>actually do this?
>
>I would vote for this, but currently it is not done unless
>-faggressive-function-elimination is specified.

How about putting the short-circuiting of logical expressions with impure functions under the same flag?


>By the way, there is a bit of strangeness about this. People use -Ofast
>knowing it will break all sorts of standards for numercial computation

Things are a bit different with -Ofast and -ffast-math. They are not enabled by default, their effects are mentioned in the documentation, and they usually don't change the results totally.

Short-circuiting is being done by default, is not properly documented and can potentially have a huge impact on results (an impure function that is optimized away can totally alter the program flow and all numerical results).


>but they balk at using an optimization that is explicitly permitted
>by the standard.

The standard allowing it is really not any consolation, if my results are changed by this 'optimization' and I don't have any guarantee that different compilers do the same thing with my code.

That's simply a bad idea, no matter how many official approval stamps it gets.

Cheers,
Janus
Janus Weil June 16, 2018, 10:53 a.m. | #21
Am 15. Juni 2018 19:49:42 MESZ schrieb Janne Blomqvist <blomqvist.janne@gmail.com>:
>On Fri, Jun 15, 2018 at 12:22 PM, Janus Weil <janus@gcc.gnu.org> wrote:
>
>> Am 14. Juni 2018 12:40:19 MESZ schrieb Janne Blomqvist <
>> blomqvist.janne@gmail.com>:
>> In Fortran, it still feels like functions as such are second-class
>> citizens. People seriously advise against using them. Doesn't really
>help
>> the attractivity of the language.
>
>
>Yes. Back when I followed c.l.f, several experts did advise people to
>never
>use functions unless they were pure (or more or less effectively so, if
>they didn't fulfill the standard requirements for purity). Considering
>that
>at least some of those same experts were also part of the Fortran
>standards
>committee, I just find it very strange that, to the best of my
>knowledge,
>no effort to fix this has been done.

Very strange indeed.

If someone tells me "don't use functions in Fortran", then what I hear is "don't use Fortran, it's broken".

Why would you introduce a language feature that is not meant to be used?


>IMHO it is completely insane to optimize out impure functions, just for
>a
>> little bit of speedup, but sacrificing compiler-independent results.
>>
>> I really don't understand why I'm so alone here with this opinion.
>>
>
>I would agree with you if there were some substantial majority opinion
>among Fortran programmers that all the parts of a logical expression
>are
>always evaluated

Look, in principle I don't really care how many parts of an expression are evaluated. If the compiler can prove that a certain part of the expression can not change the end result and has no side effects, great, please optimize it away. If the compiler can not prove that, the expression should not be altered. An 'optimization' that changes results is just not useful. It's broken.

Cheers,
Janus
Janus Weil June 16, 2018, 11:09 a.m. | #22
Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
>> But at least for pure functions, this optimization looks Ok.
>> 
>
>Why is everyone fixated on PURE vs IMPURE functions?

Simply because it makes a difference in this context!

That the Fortran standard does not acknowledge this fact is more than surprising to me, given that Fortran actually has a way to mark functions as pure/impure. Not all languages have that.

What's the use of a PURE keyword, if not to indicate to the compiler that certain optimizations can safely be done?

Cheers,
Janus
Thomas Koenig June 16, 2018, 11:20 a.m. | #23
Am 16.06.2018 um 12:53 schrieb Janus Weil:
>> Yes. Back when I followed c.l.f, several experts did advise people to
>> never
>> use functions unless they were pure (or more or less effectively so, if
>> they didn't fulfill the standard requirements for purity). Considering
>> that
>> at least some of those same experts were also part of the Fortran
>> standards
>> committee, I just find it very strange that, to the best of my
>> knowledge,
>> no effort to fix this has been done.

That is a bit distorted, at least from what I remember.

The advice was rather "If you use functions with side effects, you may
have problems". Something quite different.

> Very strange indeed.
> 
> If someone tells me "don't use functions in Fortran", then what I hear is "don't use Fortran, it's broken".

Well, nobody did, to the best of my knowledge (and I followed the same
discussions).
Steve Kargl June 16, 2018, 4:38 p.m. | #24
On Sat, Jun 16, 2018 at 01:09:36PM +0200, Janus Weil wrote:
> 
> 
> Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
> >> But at least for pure functions, this optimization looks Ok.
> >> 
> >
> >Why is everyone fixated on PURE vs IMPURE functions?
> 
> Simply because it makes a difference in this context!

It does not!  A function marked as PURE by a programmer
must meet certain requirement of the Fortran standard.
An unmarked function or a function marked as IMPURE can
still meet those same requirements.  Marking something as 
IMPURE has only a single requirement in the standard.

   An impure elemental procedure processes array arguments
   in array element order.

That's it.  Marking a function as IMPURE does not mean 
the function has side effects.  It does not mean that
a function must be evaluated for each reference.  Are
you advocating that gfortran must evaluate ping() 
twice for

  impure real function ping()
  end function ping
  
  x = ping() + 0 * ping()
  end
Janus Weil June 16, 2018, 7:21 p.m. | #25
Am 16. Juni 2018 18:38:40 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
>On Sat, Jun 16, 2018 at 01:09:36PM +0200, Janus Weil wrote:
>> 
>> 
>> Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl
><sgk@troutmask.apl.washington.edu>:
>> >> But at least for pure functions, this optimization looks Ok.
>> >> 
>> >
>> >Why is everyone fixated on PURE vs IMPURE functions?
>> 
>> Simply because it makes a difference in this context!
>
>It does not!  A function marked as PURE by a programmer
>must meet certain requirement of the Fortran standard.
>An unmarked function or a function marked as IMPURE can
>still meet those same requirements.  Marking something as 
>IMPURE has only a single requirement in the standard.
>
>   An impure elemental procedure processes array arguments
>   in array element order.
>
>That's it.  Marking a function as IMPURE does not mean 
>the function has side effects.  It does not mean that
>a function must be evaluated for each reference.  Are
>you advocating that gfortran must evaluate ping() 
>twice for
>
>  impure real function ping()
>  end function ping
>  
>  x = ping() + 0 * ping()
>  end

Absolutely, sir. That's what I'm advocating. If someone deliberately declares a function as impure, he should be prepared to deal with the consequences. In general, though, one can wonder whether the compiler could not warn that the impure declaration is not necessary (or, in other words, that the function would be better declared as pure).

In any case, the example you're showing is probably not the most problematic one. I assume a much more frequent and critical case would be the one where people forget to mark a function that is effectively pure with the PURE attribute.

However, IIUC, gfortran has the ability to detect that a function meets all pureness requirements and mark it as implicit_pure, so that it can be treated like a pure procedure, even without any explicit PURE declaration by the programmer.

If that's the case, the scheme I proposed should not even have any major performance penalty, since the pureness of a function without explicit pure/impure declaration can be automatically detected, and appropriate optimizations can be chosen.

Therefore the most reasonable strategy I can see is still to apply short-circuiting only to pure functions (explicit or implicit) and avoid it for impure ones.

Cheers,
Janus
Steve Kargl June 16, 2018, 7:43 p.m. | #26
On Sat, Jun 16, 2018 at 09:21:16PM +0200, Janus Weil wrote:
> 
> 
> Am 16. Juni 2018 18:38:40 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
> >On Sat, Jun 16, 2018 at 01:09:36PM +0200, Janus Weil wrote:
> >> 
> >> 
> >> Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl
> ><sgk@troutmask.apl.washington.edu>:
> >> >> But at least for pure functions, this optimization looks Ok.
> >> >> 
> >> >
> >> >Why is everyone fixated on PURE vs IMPURE functions?
> >> 
> >> Simply because it makes a difference in this context!
> >
> >It does not!  A function marked as PURE by a programmer
> >must meet certain requirement of the Fortran standard.
> >An unmarked function or a function marked as IMPURE can
> >still meet those same requirements.  Marking something as 
> >IMPURE has only a single requirement in the standard.
> >
> >   An impure elemental procedure processes array arguments
> >   in array element order.
> >
> >That's it.  Marking a function as IMPURE does not mean 
> >the function has side effects.  It does not mean that
> >a function must be evaluated for each reference.  Are
> >you advocating that gfortran must evaluate ping() 
> >twice for
> >
> >  impure real function ping()
> >  end function ping
> >  
> >  x = ping() + 0 * ping()
> >  end
> 
> Absolutely, sir. That's what I'm advocating. If someone
> deliberately declares a function as impure, he should be
> prepared to deal with the consequences. In general, though,
> one can wonder whether the compiler could not warn that
> the impure declaration is not necessary (or, in other words,i
> that the function would be better declared as pure).

This is a different answer than what you gave in
the PR when I asked if you were special casing
.and. and .or.  It is trivial to force the evaluation
of operands.  I already posted the diff in the PR
where I special cased .and. and .or.

op1 .op. op2

needs to be converted to

tmp1 = op1
tmp2 = op2
x = tmp1 .op. tmp

for all binary operators. 

> In any case, the example you're showing is probably not
> the most problematic one. I assume a much more frequent
> and critical case would be the one where people forget
> to mark a function that is effectively pure with the
> PURE attribute.

See Fortran 2018, Note 10.28 (if I remember correctly).
That example explicitly involves .and., and it explicitly
states that a function need not be evaluate.   It does
not sya "pure function".  It says "function".
Steve Kargl June 16, 2018, 8 p.m. | #27
On Sat, Jun 16, 2018 at 12:43:08PM -0700, Steve Kargl wrote:
> 
> This is a different answer than what you gave in
> the PR when I asked if you were special casing
> .and. and .or.  It is trivial to force the evaluation
> of operands.  I already posted the diff in the PR
> where I special cased .and. and .or.
> 
> op1 .op. op2
> 
> needs to be converted to
> 
> tmp1 = op1
> tmp2 = op2
> x = tmp1 .op. tmp
> 
> for all binary operators. 
> 

Untested.

Index: trans-expr.c
===================================================================
--- trans-expr.c	(revision 261674)
+++ trans-expr.c	(working copy)
@@ -3429,6 +3429,10 @@ gfc_conv_expr_op (gfc_se * se, gfc_expr * expr)
   gfc_conv_expr (&rse, expr->value.op.op2);
   gfc_add_block_to_block (&se->pre, &rse.pre);
 
+  /* Force evaluation of op1 and op2 to prevent shorting-circuiting.  */
+  lse.expr = gfc_evaluate_now (lse.expr);
+  rse.expr = gfc_evaluate_noe (rse.expr);
+
   if (checkstring)
     {
       gfc_conv_string_parameter (&lse);
Jakub Jelinek June 16, 2018, 8:40 p.m. | #28
On Sat, Jun 16, 2018 at 01:00:14PM -0700, Steve Kargl wrote:
> Untested.
> 
> Index: trans-expr.c
> ===================================================================
> --- trans-expr.c	(revision 261674)
> +++ trans-expr.c	(working copy)
> @@ -3429,6 +3429,10 @@ gfc_conv_expr_op (gfc_se * se, gfc_expr * expr)
>    gfc_conv_expr (&rse, expr->value.op.op2);
>    gfc_add_block_to_block (&se->pre, &rse.pre);
>  
> +  /* Force evaluation of op1 and op2 to prevent shorting-circuiting.  */
> +  lse.expr = gfc_evaluate_now (lse.expr);
> +  rse.expr = gfc_evaluate_noe (rse.expr);

s/noe/now/

Or just use TRUTH_AND_EXPR/TRUTH_OR_EXPR instead of
TRUTH_ANDIF_EXPR/TRUTH_ORIF_EXPR, the difference between those is exactly in
short-circuiting.  ANDIF/ORIF are for C/C++ &&/||, AND/OR are for C/C++ &/|.

	Jakub
Janus Weil June 16, 2018, 8:42 p.m. | #29
Am 16. Juni 2018 21:43:08 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
>On Sat, Jun 16, 2018 at 09:21:16PM +0200, Janus Weil wrote:
>> 
>> 
>> Am 16. Juni 2018 18:38:40 MESZ schrieb Steve Kargl
><sgk@troutmask.apl.washington.edu>:
>> >On Sat, Jun 16, 2018 at 01:09:36PM +0200, Janus Weil wrote:
>> >> 
>> >> 
>> >> Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl
>> ><sgk@troutmask.apl.washington.edu>:
>> >> >> But at least for pure functions, this optimization looks Ok.
>> >> >> 
>> >> >
>> >> >Why is everyone fixated on PURE vs IMPURE functions?
>> >> 
>> >> Simply because it makes a difference in this context!
>> >
>> >It does not!  A function marked as PURE by a programmer
>> >must meet certain requirement of the Fortran standard.
>> >An unmarked function or a function marked as IMPURE can
>> >still meet those same requirements.  Marking something as 
>> >IMPURE has only a single requirement in the standard.
>> >
>> >   An impure elemental procedure processes array arguments
>> >   in array element order.
>> >
>> >That's it.  Marking a function as IMPURE does not mean 
>> >the function has side effects.  It does not mean that
>> >a function must be evaluated for each reference.  Are
>> >you advocating that gfortran must evaluate ping() 
>> >twice for
>> >
>> >  impure real function ping()
>> >  end function ping
>> >  
>> >  x = ping() + 0 * ping()
>> >  end
>> 
>> Absolutely, sir. That's what I'm advocating. If someone
>> deliberately declares a function as impure, he should be
>> prepared to deal with the consequences. In general, though,
>> one can wonder whether the compiler could not warn that
>> the impure declaration is not necessary (or, in other words,i
>> that the function would be better declared as pure).
>
>This is a different answer than what you gave in
>the PR when I asked if you were special casing
>.and. and .or.

Possibly we're having a communication issue here. I don't quite understand what you mean by "special casing".


>It is trivial to force the evaluation
>of operands.  I already posted the diff in the PR

I know it's rather trivial to do this. Instead of using temporaries, as you propose, one could also use a proper middle-end operator, like BIT_AND_EXPR instead of TRUTH_AND_EXPR.

The non-trivial part is obviously to find a consensus about what should be done.


>> In any case, the example you're showing is probably not
>> the most problematic one. I assume a much more frequent
>> and critical case would be the one where people forget
>> to mark a function that is effectively pure with the
>> PURE attribute.
>
>See Fortran 2018, Note 10.28 (if I remember correctly).
>That example explicitly involves .and., and it explicitly
>states that a function need not be evaluate.   It does
>not sya "pure function".  It says "function".

You can stop throwing standard quotes. I do know what the standard says about this by now.

Point is: It does not sound very reasonable to me and I agree with Janne that the standard's treatment of impure functions is somewhere between careless and insane.

If the standard says it's ok to optimize out any kind of function, I would tend to take this less as an encouragement to compiler writers to absolutely do this under any circumstance, but rather as saying: "Those compiler people are smart enough. They can figure out where it makes sense all by themselves."

I happen to hold the opinion that optimizing out a call to a pure function may be reasonable if it does not influence the result of an expression, but optimizing out an effectively impure function (i.e. one with side effects) is not a good idea at any time, since such an 'optimization' can drastically change the program flow and all numerical results of a piece of code. (Note the the standard does not require this kind of optimization either, and other popular compilers, like ifort, do not do it.)

If we can find some kind of agreement that what I'm proposing is a reasonable thing to do, I will certainly be happy to provide a patch (however I will not be able to get to it before next weekend).

Right now I feel like I'm running into various walls. Janne is basically the only one so far who expressed at least partial agreement with some of my notions, so I certainly don't feel very encouraged to even start working on a patch ...

Cheers,
Janus
Thomas Koenig June 16, 2018, 9:14 p.m. | #30
Hi Janus,

> I happen to hold the opinion that optimizing out a call to a pure 
> function may be reasonable if it does not influence the result of an 
> expression, but optimizing out an effectively impure function (i.e. one 
> with side effects) is not a good idea at any time, since such an 
> 'optimization' can drastically change the program flow and all numerical 
> results of a piece of code.

Well, I am of a different opinion, and so is the Fortran standard.

I think the compiler should strive to, in that order,

- conform to the language standard
- generate fast programs
- warn about features which may trip the user

In my patch, I have tried to do all three things at the same time, and
after this discussion, I still think that this is the right path
to follow.

So, here is an update on the patch, which also covers ALLOCATED.

Regression-tested. OK?

	Thomas
! { dg-do compile }
! { dg-additional-options "-Wsurprising -fdump-tree-original" }
! PR 85599 - check warning that impure function calls might be removed,
! and that logical expressions involving .and. and .or. will be
! reordered. 

MODULE M1
 TYPE T1
   LOGICAL :: T=.TRUE.
 END TYPE T1
CONTAINS
 SUBROUTINE S1(m)
   TYPE(T1), POINTER :: m
   TYPE(T1), ALLOCATABLE :: x
   IF (ASSOCIATED(m) .AND. m%T) THEN ! { dg-warning "does not guard expression" }
    WRITE(6,*) "X"
   ENDIF
   IF (ALLOCATED(x) .AND. x%T) THEN ! { dg-warning "does not guard expression" }
    WRITE(6,*) ""
   ENDIF
 END SUBROUTINE
END MODULE

module x
  logical :: flag = .true.
  integer :: count = 0
contains
  pure function f()
    logical :: f
    f = .true.
  end function f

  function g()
    logical :: g
    g = .false.
  end function g

  real function h()
     h = 1.2
     count = count + 1
  end function h
end module x

program main
  use x
  print *, g() .and. f() ! No warning, because g() follows all the rules of a pure function
  print *, f() .and. flag
  print *, h() > 1.0 .and. flag ! { dg-warning "might not be evaluated" }
  print *, h() < 1.0 .or. flag ! { dg-warning "might not be evaluated" }

end program main
! { dg-final { scan-tree-dump-times "flag &&" 2 "original" } }
! { dg-final { scan-tree-dump-times "flag \\|\\|" 1 "original" } }
Index: dump-parse-tree.c
===================================================================
--- dump-parse-tree.c	(Revision 261388)
+++ dump-parse-tree.c	(Arbeitskopie)
@@ -716,6 +716,8 @@ show_attr (symbol_attribute *attr, const char * mo
     fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
     fputs (" PURE", dumpfile);
+  if (attr->implicit_pure)
+    fputs (" IMPLICIT_PURE", dumpfile);
   if (attr->recursive)
     fputs (" RECURSIVE", dumpfile);
 
Index: resolve.c
===================================================================
--- resolve.c	(Revision 261388)
+++ resolve.c	(Arbeitskopie)
@@ -3807,7 +3807,43 @@ lookup_uop_fuzzy (const char *op, gfc_symtree *uop
   return gfc_closest_fuzzy_match (op, candidates);
 }
 
+/* Callback finding an impure function as an operand to an .and. or
+   .or.  expression.  Remember the last function warned about to
+   avoid double warnings when recursing.  */
 
+static int
+impure_function_callback (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+			  void *data)
+{
+  gfc_expr *f = *e;
+  const char *name;
+  static gfc_expr *last = NULL;
+  bool *found = (bool *) data;
+
+  if (f->expr_type == EXPR_FUNCTION)
+    {
+      *found = 1;
+      if (f != last && !pure_function (f, &name))
+	{
+	  /* This could still be a function without side effects, i.e.
+	     implicit pure.  Do not warn for that case.  */
+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	      || !gfc_implicit_pure (f->symtree->n.sym))
+	    {
+	      if (name)
+		gfc_warning (OPT_Wsurprising, "Function %qs at %L "
+			     "might not be evaluated", name, &f->where);
+	      else
+		gfc_warning (OPT_Wsurprising, "Function at %L "
+			     "might not be evaluated", &f->where);
+	    }
+	}
+      last = f;
+    }
+
+  return 0;
+}
+
 /* Resolve an operator expression node.  This can involve replacing the
    operation with a user defined function call.  */
 
@@ -3910,6 +3946,8 @@ resolve_operator (gfc_expr *e)
     case INTRINSIC_NEQV:
       if (op1->ts.type == BT_LOGICAL && op2->ts.type == BT_LOGICAL)
 	{
+	  bool dont_move = false;
+
 	  e->ts.type = BT_LOGICAL;
 	  e->ts.kind = gfc_kind_max (op1, op2);
 	  if (op1->ts.kind < e->ts.kind)
@@ -3916,6 +3954,67 @@ resolve_operator (gfc_expr *e)
 	    gfc_convert_type (op1, &e->ts, 2);
 	  else if (op2->ts.kind < e->ts.kind)
 	    gfc_convert_type (op2, &e->ts, 2);
+
+	  if (e->value.op.op == INTRINSIC_AND || e->value.op.op == INTRINSIC_OR)
+	    {
+	      bool op1_f, op2_f;
+
+	      op1_f = false;
+	      op2_f = false;
+	      gfc_expr_walker (&op1, impure_function_callback, &op1_f);
+	      gfc_expr_walker (&op2, impure_function_callback, &op2_f);
+
+	      /* Some people code which depends on the short-circuiting that
+		 Fortran does not provide, such as
+
+		 if (associated(m) .and. m%t) then
+
+		 So, warn about this idiom. However, avoid breaking
+		 it on purpose.  */
+
+	      if (op1->expr_type == EXPR_FUNCTION && op1->value.function.isym)
+		{
+		  gfc_expr *e;
+		  bool warn = false;
+		  gfc_isym_id id;
+
+		  id = op1->value.function.isym->id;
+		  if (id == GFC_ISYM_ASSOCIATED)
+		    {
+		      e = op1->value.function.actual->expr;
+		      warn = op1->value.function.actual->next->expr == NULL;
+		    }
+		  else if (id == GFC_ISYM_ALLOCATED)
+		    {
+		      e = op1->value.function.actual->expr;
+		      warn = true;
+		    }
+
+		  if (warn && gfc_check_dependency (e, op2, true))
+		    {
+		      gfc_warning (OPT_Wsurprising, "%qs function call at "
+				   "%L does not guard expression at %L",
+				   op1->value.function.isym->name,
+				   &op1->where, &op2->where);
+		      dont_move = true;
+		    }
+		}
+
+	      /* A bit of optimization: Transfer if (f(x) .and. flag)
+		 into if (flag .and. f(x)), to save evaluation of a
+		 function.  The middle end should be capable of doing
+		 this with a TRUTH_AND_EXPR, but it currently does not do
+		 so. See PR 85599.  */
+
+	      if (!dont_move && op1_f && !op2_f)
+		{
+		  e->value.op.op1 = op2;
+		  e->value.op.op2 = op1;
+		  op1 = e->value.op.op1;
+		  op2 = e->value.op.op2;
+		}
+	    }
+
 	  break;
 	}
Steve Kargl June 16, 2018, 9:20 p.m. | #31
On Sat, Jun 16, 2018 at 10:42:16PM +0200, Janus Weil wrote:
> 
> Am 16. Juni 2018 21:43:08 MESZ schrieb Steve Kargl <sgk@troutmask.apl.washington.edu>:
> >On Sat, Jun 16, 2018 at 09:21:16PM +0200, Janus Weil wrote:
> >> 
> >> 
> >> Am 16. Juni 2018 18:38:40 MESZ schrieb Steve Kargl
> ><sgk@troutmask.apl.washington.edu>:
> >> >On Sat, Jun 16, 2018 at 01:09:36PM +0200, Janus Weil wrote:
> >> >> 
> >> >> 
> >> >> Am 15. Juni 2018 20:38:17 MESZ schrieb Steve Kargl
> >> ><sgk@troutmask.apl.washington.edu>:
> >> >> >> But at least for pure functions, this optimization looks Ok.
> >> >> >> 
> >> >> >
> >> >> >Why is everyone fixated on PURE vs IMPURE functions?
> >> >> 
> >> >> Simply because it makes a difference in this context!
> >> >
> >> >It does not!  A function marked as PURE by a programmer
> >> >must meet certain requirement of the Fortran standard.
> >> >An unmarked function or a function marked as IMPURE can
> >> >still meet those same requirements.  Marking something as 
> >> >IMPURE has only a single requirement in the standard.
> >> >
> >> >   An impure elemental procedure processes array arguments
> >> >   in array element order.
> >> >
> >> >That's it.  Marking a function as IMPURE does not mean 
> >> >the function has side effects.  It does not mean that
> >> >a function must be evaluated for each reference.  Are
> >> >you advocating that gfortran must evaluate ping() 
> >> >twice for
> >> >
> >> >  impure real function ping()
> >> >  end function ping
> >> >  
> >> >  x = ping() + 0 * ping()
> >> >  end
> >> 
> >> Absolutely, sir. That's what I'm advocating. If someone
> >> deliberately declares a function as impure, he should be
> >> prepared to deal with the consequences. In general, though,
> >> one can wonder whether the compiler could not warn that
> >> the impure declaration is not necessary (or, in other words,i
> >> that the function would be better declared as pure).
> >
> >This is a different answer than what you gave in
> >the PR when I asked if you were special casing
> >.and. and .or.
> 
> Possibly we're having a communication issue here.

It seems so.

> >> In any case, the example you're showing is probably not
> >> the most problematic one. I assume a much more frequent
> >> and critical case would be the one where people forget
> >> to mark a function that is effectively pure with the
> >> PURE attribute.
> >
> >See Fortran 2018, Note 10.28 (if I remember correctly).
> >That example explicitly involves .and., and it explicitly
> >states that a function need not be evaluate.   It does
> >not sya "pure function".  It says "function".
> 
> You can stop throwing standard quotes. I do know what the
> standard says about this by now.
> 
> Point is: It does not sound very reasonable to me and I agree
> with Janne that the standard's treatment of impure functions
> is somewhere between careless and insane.
> 
> If the standard says it's ok to optimize out any kind of
> function,

That is exactly what the Fortran standard says!  I've quoted the
relevant parts of Fortran standard, but you seem relucant to 
accept it and have ask me to stop supporting my position.

For the logical expression, 

x = .false. .and. check()

a Fortran compiler can replace this expression with

x = .false.

without evaluating check() (regardless of PURE vs IMPURE).

That is exactly what Fortran 2018 Note 10.28 says.  I 
won't quote it again.
Steve Kargl June 16, 2018, 9:22 p.m. | #32
On Sat, Jun 16, 2018 at 11:14:08PM +0200, Thomas Koenig wrote:
> Hi Janus,
> 
> > I happen to hold the opinion that optimizing out a call to a pure 
> > function may be reasonable if it does not influence the result of an 
> > expression, but optimizing out an effectively impure function (i.e. one 
> > with side effects) is not a good idea at any time, since such an 
> > 'optimization' can drastically change the program flow and all numerical 
> > results of a piece of code.
> 
> Well, I am of a different opinion, and so is the Fortran standard.
> 
> I think the compiler should strive to, in that order,
> 
> - conform to the language standard
> - generate fast programs
> - warn about features which may trip the user
> 
> In my patch, I have tried to do all three things at the same time, and
> after this discussion, I still think that this is the right path
> to follow.
> 

+1
Janus Weil June 16, 2018, 10:38 p.m. | #33
Am 16. Juni 2018 23:14:08 MESZ schrieb Thomas Koenig <tkoenig@netcologne.de>:
>Hi Janus,
>
>> I happen to hold the opinion that optimizing out a call to a pure 
>> function may be reasonable if it does not influence the result of an 
>> expression, but optimizing out an effectively impure function (i.e.
>one 
>> with side effects) is not a good idea at any time, since such an 
>> 'optimization' can drastically change the program flow and all
>numerical 
>> results of a piece of code.
>
>Well, I am of a different opinion

Of course opinions can differ. But could you explain what in my above statements sounds unreasonable to you? Otherwise it's hard to have an argument-based discussion.


>and so is the Fortran standard.

I don't think anything I wrote is in conflict with the standard. As I see it, the standard neither demands nor forbids short-circuiting. I'm just trying to fill that void in a reasonable way (having as much optimization as possible without altering results).


>I think the compiler should strive to, in that order,
>
>- conform to the language standard
>- generate fast programs
>- warn about features which may trip the user

I certainly agree to all of that, with the amendment that generating correct results should have precedence over generating fast programs.


>In my patch, I have tried to do all three things at the same time, and
>after this discussion, I still think that this is the right path
>to follow.
>
>So, here is an update on the patch, which also covers ALLOCATED.
>
>Regression-tested. OK?

I absolutely welcome the warnings for impure functions as second operand to .and./.or. operators, but (as noted earlier) I disagree with changing the ordering of operands. As our lengthy discussions show, things are already complicated enough, and this additional optimization just adds to the confusion IMHO.

Cheers,
Janus

Patch

Index: fortran/dump-parse-tree.c
===================================================================
--- fortran/dump-parse-tree.c	(Revision 261388)
+++ fortran/dump-parse-tree.c	(Arbeitskopie)
@@ -716,6 +716,8 @@  show_attr (symbol_attribute *attr, const char * mo
     fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
     fputs (" PURE", dumpfile);
+  if (attr->implicit_pure)
+    fputs (" IMPLICIT_PURE", dumpfile);
   if (attr->recursive)
     fputs (" RECURSIVE", dumpfile);
 
Index: fortran/resolve.c
===================================================================
--- fortran/resolve.c	(Revision 261388)
+++ fortran/resolve.c	(Arbeitskopie)
@@ -3807,7 +3807,43 @@  lookup_uop_fuzzy (const char *op, gfc_symtree *uop
   return gfc_closest_fuzzy_match (op, candidates);
 }
 
+/* Callback finding an impure function as an operand to an .and. or
+   .or.  expression.  Remember the last function warned about to
+   avoid double warnings when recursing.  */
 
+static int
+impure_function_callback (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+			  void *data)
+{
+  gfc_expr *f = *e;
+  const char *name;
+  static gfc_expr *last = NULL;
+  bool *found = (bool *) data;
+
+  if (f->expr_type == EXPR_FUNCTION)
+    {
+      *found = 1;
+      if (f != last && !pure_function (f, &name))
+	{
+	  /* This could still be a function without side effects, i.e.
+	     implicit pure.  Do not warn for that case.  */
+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	      || !gfc_implicit_pure (f->symtree->n.sym))
+	    {
+	      if (name)
+		gfc_warning (OPT_Wsurprising, "Impure function %qs at %L "
+			     "might not be evaluated", name, &f->where);
+	      else
+		gfc_warning (OPT_Wsurprising, "Impure function at %L "
+			     "might not be evaluated", &f->where);
+	    }
+	}
+      last = f;
+    }
+
+  return 0;
+}
+
 /* Resolve an operator expression node.  This can involve replacing the
    operation with a user defined function call.  */
 
@@ -3910,6 +3946,8 @@  resolve_operator (gfc_expr *e)
     case INTRINSIC_NEQV:
       if (op1->ts.type == BT_LOGICAL && op2->ts.type == BT_LOGICAL)
 	{
+	  bool dont_move = false;
+
 	  e->ts.type = BT_LOGICAL;
 	  e->ts.kind = gfc_kind_max (op1, op2);
 	  if (op1->ts.kind < e->ts.kind)
@@ -3916,6 +3954,53 @@  resolve_operator (gfc_expr *e)
 	    gfc_convert_type (op1, &e->ts, 2);
 	  else if (op2->ts.kind < e->ts.kind)
 	    gfc_convert_type (op2, &e->ts, 2);
+
+	  if (e->value.op.op == INTRINSIC_AND || e->value.op.op == INTRINSIC_OR)
+	    {
+	      bool op1_f, op2_f;
+
+	      op1_f = false;
+	      op2_f = false;
+	      gfc_expr_walker (&op1, impure_function_callback, &op1_f);
+	      gfc_expr_walker (&op2, impure_function_callback, &op2_f);
+
+	      /* Some people code which depends on the short-circuiting that
+		 Fortran does not provide, such as
+
+		 if (associated(m) .and. m%t) then
+
+		 So, warn about this idiom. However, avoid breaking
+		 it on purpose.  */
+
+	      if (op1->expr_type == EXPR_FUNCTION && op1->value.function.isym
+		  && op1->value.function.isym->id == GFC_ISYM_ASSOCIATED)
+		{
+		  gfc_expr *e = op1->value.function.actual->expr;
+		  gfc_expr *en = op1->value.function.actual->next->expr;
+		  if (en == NULL && gfc_check_dependency (e, op2, true))
+		    {
+		      gfc_warning (OPT_Wsurprising, "%qs function call at %L does "
+				   "not guard expression at %L", "ASSOCIATED",
+				   &op1->where, &op2->where);
+		      dont_move = true;
+		    }
+		}
+
+	      /* A bit of optimization: Transfer if (f(x) .and. flag)
+		 into if (flag .and. f(x)), to save evaluation of a
+		 function.  The middle end should be capable of doing
+		 this with a TRUTH_AND_EXPR, but it currently does not do
+		 so. See PR 85599.  */
+
+	      if (!dont_move && op1_f && !op2_f)
+		{
+		  e->value.op.op1 = op2;
+		  e->value.op.op2 = op1;
+		  op1 = e->value.op.op1;
+		  op2 = e->value.op.op2;
+		}
+	    }
+
 	  break;
 	}
 
Index: testsuite/gfortran.dg/alloc_comp_default_init_2.f90
===================================================================
--- testsuite/gfortran.dg/alloc_comp_default_init_2.f90	(Revision 261388)
+++ testsuite/gfortran.dg/alloc_comp_default_init_2.f90	(Arbeitskopie)
@@ -11,7 +11,8 @@  program testprog
   integer, save :: callnb = 0
   type(t_type) :: this
   allocate ( this % chars ( 4))
-  if (.not.recursivefunc (this) .or. (callnb .ne. 10)) STOP 1
+  if (.not.recursivefunc (this)) STOP 1 
+  if (callnb .ne. 10) STOP 2
 contains
   recursive function recursivefunc ( this ) result ( match )
     type(t_type), intent(in) :: this