
[ranger] x == -0.0 does not mean we can replace x with -0.0

Message ID 20220826154606.1155977-1-aldyh@redhat.com
State New
Series [ranger] x == -0.0 does not mean we can replace x with -0.0

Commit Message

Aldy Hernandez Aug. 26, 2022, 3:46 p.m. UTC
On the true side of x == -0.0, we can't just blindly value propagate
the -0.0 into every use of x because x could be +0.0 (or vice versa).
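
As a small illustration (not part of the patch), the propagation would
be user-visible here:

  /* Illustration only: replacing x with -0.0 on the true edge would
     change the output from "inf" to "-inf".  */
  void
  demo (void)
  {
    double x = 0.0;
    if (x == -0.0)                         /* true: +0.0 == -0.0 */
      __builtin_printf ("%g\n", 1.0 / x);  /* 1.0/+0.0 is inf, 1.0/-0.0 is -inf */
  }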

With this change, we only allow the transformation if
!HONOR_SIGNED_ZEROS or if the range is known not to contain 0.

Will commit after tests complete.

gcc/ChangeLog:

	* range-op-float.cc (foperator_equal::op1_range): Do not blindly
	copy op2 range when honoring signed zeros.
---
 gcc/range-op-float.cc | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Jakub Jelinek Aug. 26, 2022, 4:40 p.m. UTC | #1
On Fri, Aug 26, 2022 at 05:46:06PM +0200, Aldy Hernandez wrote:
> On the true side of x == -0.0, we can't just blindly value propagate
> the -0.0 into every use of x because x could be +0.0 (or vice versa).
> 
> With this change, we only allow the transformation if
> !HONOR_SIGNED_ZEROS or if the range is known not to contain 0.
> 
> Will commit after tests complete.
> 
> gcc/ChangeLog:
> 
> 	* range-op-float.cc (foperator_equal::op1_range): Do not blindly
> 	copy op2 range when honoring signed zeros.
> ---
>  gcc/range-op-float.cc | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> index ad2fae578d2..ff9fe312acf 100644
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc
> @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type,
>    switch (get_bool_state (r, lhs, type))
>      {
>      case BRS_TRUE:
> -      // If it's true, the result is the same as OP2.
> -      r = op2;
> +      if (HONOR_SIGNED_ZEROS (type)
> +	  && op2.contains_p (build_zero_cst (type)))

What exactly does op2.contains_p do for zero?
Does it use real_compare/real_equal under the hood, so it is
equivalent to op2 == 0.0 or op2 == -0.0, where both will be
true whether op2 is -0.0 or 0.0?  Or is it more strict and
checks whether it is actually a positive zero?
In any case, for HONOR_SIGNED_ZEROS, VARYING is unnecessary, all you
can do is extend the r range to contain both -0.0 and +0.0 if it contains
at least one of them.
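
Something along these lines (hand-written sketch; add_both_zeros is a
made-up helper, not an actual frange API):

  // Copy op2, then widen it instead of dropping to VARYING.
  r = op2;
  if (HONOR_SIGNED_ZEROS (type)
      && r.contains_p (build_zero_cst (type)))
    // Make sure r covers -0.0 as well as +0.0, e.g. by unioning in
    // the two-point set { -0.0, +0.0 }.
    add_both_zeros (r, type);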

> +	{
> +	  // With signed zeros, x == -0.0 does not mean we can replace
> +	  // x with -0.0, because x may be either +0.0 or -0.0.
> +	  r.set_varying (type);
> +	}
> +      else
> +	{
> +	  // If it's true, the result is the same as OP2.
> +	  //
> +	  // If the range does not actually contain zeros, this should
> +	  // always be OK.
> +	  r = op2;
> +	}

!HONOR_SIGNED_ZEROS doesn't imply that negative zeros won't appear,
but says that user doesn't care if he gets a positive or negative zero
(unless !MODE_HAS_SIGNED_ZEROS - in that case -0.0 doesn't exist
and one doesn't need to bother with it).

Now, if all the code setting franges makes sure that for
MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS if +0.0 or -0.0 are inside
of a range, then both -0.0 and +0.0 are in the range, then yes,
you can use r = op2;

>        // The TRUE side of op1 == op2 implies op1 is !NAN.
>        r.set_nan (fp_prop::NO);
>        break;

	Jakub
Aldy Hernandez Aug. 29, 2022, 1:13 p.m. UTC | #2
On Fri, Aug 26, 2022 at 6:40 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Aug 26, 2022 at 05:46:06PM +0200, Aldy Hernandez wrote:
> > On the true side of x == -0.0, we can't just blindly value propagate
> > the -0.0 into every use of x because x could be +0.0 (or vice versa).
> >
> > With this change, we only allow the transformation if
> > !HONOR_SIGNED_ZEROS or if the range is known not to contain 0.
> >
> > Will commit after tests complete.
> >
> > gcc/ChangeLog:
> >
> >       * range-op-float.cc (foperator_equal::op1_range): Do not blindly
> >       copy op2 range when honoring signed zeros.
> > ---
> >  gcc/range-op-float.cc | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> > index ad2fae578d2..ff9fe312acf 100644
> > --- a/gcc/range-op-float.cc
> > +++ b/gcc/range-op-float.cc
> > @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type,
> >    switch (get_bool_state (r, lhs, type))
> >      {
> >      case BRS_TRUE:
> > -      // If it's true, the result is the same as OP2.
> > -      r = op2;
> > +      if (HONOR_SIGNED_ZEROS (type)
> > +       && op2.contains_p (build_zero_cst (type)))
>
> What exactly does op2.contains_p do for zero?
> Does it use real_compare/real_equal under the hood, so it is
> equivalent to op2 == 0.0 or op2 == -0.0, where both will be
> true whether op2 is -0.0 or 0.0?  Or is it more strict and
> checks whether it is actually a positive zero?

frange::contains_p() uses real_compare(), so both -0.0 and 0.0 will
come out as true:

  return (real_compare (GE_EXPR, TREE_REAL_CST_PTR (cst), &m_min)
      && real_compare (LE_EXPR, TREE_REAL_CST_PTR (cst), &m_max));
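
Just to spell out the underlying comparison semantics (nothing
frange-specific):

  /* Real comparisons cannot tell the two zeros apart; only the sign
     bit differs.  */
  int
  zeros_compare_equal (void)
  {
    return (-0.0 == 0.0                     /* true */
            && __builtin_signbit (-0.0));   /* also true: sign bit set */
  }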

I thought about this some more, and you're right, dropping to VARYING
is a big hammer.

It seems to me we can do this optimization regardless, but then treat
positive and negative zero the same throughout the frange class.
Particularly, in frange::singleton_p().  We should never return TRUE
for any version of 0.0.  This will keep VRP from propagating an
incorrect 0.0, since all VRP does is propagate when a range is
provably a singleton.  Also, frange::zero_p() shall return true for
any version of 0.0.
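
Concretely, something like this guard in singleton_p (sketch only;
the field names are made up):

  // Never treat any zero as a singleton, so VRP won't substitute a
  // 0.0 whose sign it cannot prove.
  if (real_identical (&m_min, &m_max)
      && real_compare (EQ_EXPR, &m_min, &dconst0))
    return false;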

I fleshed out all the relational operators (with endpoints) with this
approach, and everything worked out... including Go, Ada, and Fortran,
which had given me headaches.  As a bonus, we can get rid of the
INF/NINF property bits I was keeping around, since now the range will
have actual endpoints.  I will repost the full frange endpoints patch
(and CC you) in the appropriate thread.

Aldy

> In any case, for HONOR_SIGNED_ZEROS, VARYING is unnecessary, all you
> can do is extend the r range to contain both -0.0 and +0.0 if it contains
> at least one of them.
>
> > +     {
> > +       // With signed zeros, x == -0.0 does not mean we can replace
> > +       // x with -0.0, because x may be either +0.0 or -0.0.
> > +       r.set_varying (type);
> > +     }
> > +      else
> > +     {
> > +       // If it's true, the result is the same as OP2.
> > +       //
> > +       // If the range does not actually contain zeros, this should
> > +       // always be OK.
> > +       r = op2;
> > +     }
>
> !HONOR_SIGNED_ZEROS doesn't imply that negative zeros won't appear,
> but says that user doesn't care if he gets a positive or negative zero
> (unless !MODE_HAS_SIGNED_ZEROS - in that case -0.0 doesn't exist
> and one doesn't need to bother with it).
>
> Now, if all the code setting franges makes sure that for
> MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS if +0.0 or -0.0 are inside
> of a range, then both -0.0 and +0.0 are in the range, then yes,
> you can use r = op2;
>
> >        // The TRUE side of op1 == op2 implies op1 is !NAN.
> >        r.set_nan (fp_prop::NO);
> >        break;
>
>         Jakub
>
Jakub Jelinek Aug. 29, 2022, 1:21 p.m. UTC | #3
On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
> It seems to me we can do this optimization regardless, but then treat
> positive and negative zero the same throughout the frange class.
> Particularly, in frange::singleton_p().  We should never return TRUE
> for any version of 0.0.  This will keep VRP from propagating an
> incorrect 0.0, since all VRP does is propagate when a range is
> provably a singleton.  Also, frange::zero_p() shall return true for
> any version of 0.0.

Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
differentiate between 0.0 and -0.0.
One reason is e.g. to be able to optimize copysign/signbit - if we can
prove that the sign bit on some value will be always cleared or always set,
we can fold those.
On the other side, with -fno-signed-zeros it is invalid to use
copysign/signbit on values that could be zero (well, nothing guarantees
whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
!HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
one thing (just not singleton_p) and not bother with details like whether
a range ends or starts with -0.0 or 0.0, either of them would work the same.
And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
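
Re the copysign/signbit folding above, just as an illustration of the
kind of folds I mean:

  /* If range analysis proves x's sign bit is always clear (x is
     neither -0.0 nor negative), both of these fold to constants.  */
  int    sb (double x) { return __builtin_signbit (x); }       /* -> 0   */
  double cs (double x) { return __builtin_copysign (1.0, x); } /* -> 1.0 */

and similarly when the sign bit is known to be always set.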

	Jakub
Aldy Hernandez Aug. 29, 2022, 1:31 p.m. UTC | #4
On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
> > It seems to me we can do this optimization regardless, but then treat
> > positive and negative zero the same throughout the frange class.
> > Particularly, in frange::singleton_p().  We should never return TRUE
> > for any version of 0.0.  This will keep VRP from propagating an
> > incorrect 0.0, since all VRP does is propagate when a range is
> > provably a singleton.  Also, frange::zero_p() shall return true for
> > any version of 0.0.
>
> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
> differentiate between 0.0 and -0.0.
> One reason is e.g. to be able to optimize copysign/signbit - if we can
> prove that the sign bit on some value will be always cleared or always set,
> we can fold those.
> On the other side, with -fno-signed-zeros it is invalid to use
> copysign/signbit on values that could be zero (well, nothing guarantees
> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
> one thing (just not singleton_p) and not bother with details like whether
> a range ends or starts with -0.0 or 0.0, either of them would work the same.
> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.

*head explodes*

Ok, I think I can add a zero property we can track (like we do for
NAN), and set it appropriately at constant creation and upon results
from copysign/signbit.  However, I am running out of time before
Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do
that as a follow-up.

Aldy
Jeff Law Aug. 29, 2022, 2:20 p.m. UTC | #5
On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote:
> On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
>> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
>>> It seems to me we can do this optimization regardless, but then treat
>>> positive and negative zero the same throughout the frange class.
>>> Particularly, in frange::singleton_p().  We should never return TRUE
>>> for any version of 0.0.  This will keep VRP from propagating an
>>> incorrect 0.0, since all VRP does is propagate when a range is
>>> provably a singleton.  Also, frange::zero_p() shall return true for
>>> any version of 0.0.
>> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
>> differentiate between 0.0 and -0.0.
>> One reason is e.g. to be able to optimize copysign/signbit - if we can
>> prove that the sign bit on some value will be always cleared or always set,
>> we can fold those.
>> On the other side, with -fno-signed-zeros it is invalid to use
>> copysign/signbit on values that could be zero (well, nothing guarantees
>> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
>> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
>> one thing (just not singleton_p) and not bother with details like whether
>> a range ends or starts with -0.0 or 0.0, either of them would work the same.
>> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
> *head explodes*
>
> Ok, I think I can add a zero property we can track (like we do for
> NAN), and set it appropriately at constant creation and upon results
> from copysign/signbit.  However, I am running out of time before
> Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do
> that as a follow-up.
We definitely want to be able to track +-0.0 and distinguish between 
them.  IIRC there's cases where you can start eliminating comparisons 
and arithmetic once you start tracking -0.0 state.
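
For instance (purely as an illustration), the classic fold below only
becomes valid once -0.0 has been ruled out:

  /* x + 0.0 folds to x only if x cannot be -0.0, because
     -0.0 + 0.0 yields +0.0.  */
  double add_zero (double x) { return x + 0.0; }
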
Jeff
Aldy Hernandez Aug. 29, 2022, 2:26 p.m. UTC | #6
On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote:
> > On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
> >>> It seems to me we can do this optimization regardless, but then treat
> >>> positive and negative zero the same throughout the frange class.
> >>> Particularly, in frange::singleton_p().  We should never return TRUE
> >>> for any version of 0.0.  This will keep VRP from propagating an
> >>> incorrect 0.0, since all VRP does is propagate when a range is
> >>> provably a singleton.  Also, frange::zero_p() shall return true for
> >>> any version of 0.0.
> >> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
> >> differentiate between 0.0 and -0.0.
> >> One reason is e.g. to be able to optimize copysign/signbit - if we can
> >> prove that the sign bit on some value will be always cleared or always set,
> >> we can fold those.
> >> On the other side, with -fno-signed-zeros it is invalid to use
> >> copysign/signbit on values that could be zero (well, nothing guarantees
> >> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
> >> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
> >> one thing (just not singleton_p) and not bother with details like whether
> >> a range ends or starts with -0.0 or 0.0, either of them would work the same.
> >> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
> > *head explodes*
> >
> > Ok, I think I can add a zero property we can track (like we do for
> > NAN), and set it appropriately at constant creation and upon results
> > from copysign/signbit.  However, I am running out of time before
> > Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do
> > that as a follow-up.
> We definitely want to be able to track +-0.0 and distinguish between
> them.  IIRC there's cases where you can start eliminating comparisons
> and arithmetic once you start tracking -0.0 state.

Absolutely.  That was always the plan.  However, my goal was just to
add relop stubs so others could flesh out the rest.  Alas, I'm way
over that now :).  We're tracking NANs, endpoints, infinities, etc.

However, I'm hoping to forget as many floating point details, as fast
as possible, as soon as I can ;-).

Aldy
Jeff Law Aug. 29, 2022, 3:08 p.m. UTC | #7
On 8/29/2022 8:26 AM, Aldy Hernandez wrote:
> On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote:
>>> On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
>>>>> It seems to me we can do this optimization regardless, but then treat
>>>>> positive and negative zero the same throughout the frange class.
>>>>> Particularly, in frange::singleton_p().  We should never return TRUE
>>>>> for any version of 0.0.  This will keep VRP from propagating an
>>>>> incorrect 0.0, since all VRP does is propagate when a range is
>>>>> provably a singleton.  Also, frange::zero_p() shall return true for
>>>>> any version of 0.0.
>>>> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
>>>> differentiate between 0.0 and -0.0.
>>>> One reason is e.g. to be able to optimize copysign/signbit - if we can
>>>> prove that the sign bit on some value will be always cleared or always set,
>>>> we can fold those.
>>>> On the other side, with -fno-signed-zeros it is invalid to use
>>>> copysign/signbit on values that could be zero (well, nothing guarantees
>>>> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
>>>> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
>>>> one thing (just not singleton_p) and not bother with details like whether
>>>> a range ends or starts with -0.0 or 0.0, either of them would work the same.
>>>> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
>>> *head explodes*
>>>
>>> Ok, I think I can add a zero property we can track (like we do for
>>> NAN), and set it appropriately at constant creation and upon results
>>> from copysign/signbit.  However, I am running out of time before
>>> Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do
>>> that as a follow-up.
>> We definitely want to be able to track +-0.0 and distinguish between
>> them.  IIRC there's cases where you can start eliminating comparisons
>> and arithmetic once you start tracking -0.0 state.
> Absolutely.  That was always the plan.  However, my goal was just to
> add relop stubs so others could flesh out the rest.  Alas, I'm way
> over that now :).  We're tracking NANs, endpoints, infinities, etc.
Well, we'll probably cycle back and forth a bit between the solver and 
figuring out what we need to track.

Related: I had a short email thread with Siddhesh and Carlos WRT the 
idea of putting in some __builtin_unreachables into the math routines.  
They're generally OK with it, though we do need to verify those 
conditionals aren't generating extra code.   So there's a potential path 
forward for that side of the problem as well.


>
> However, I'm hoping to forget as many floating point details, as fast
> as possible, as soon as I can ;-).
Actually FP isn't that bad -- I'd largely avoided it for decades, but 
didn't have a choice earlier this year.  And there's a lot more headroom 
for improvements in the FP space than the integer space IMHO.

Jeff
Toon Moene Aug. 29, 2022, 3:13 p.m. UTC | #8
On 8/29/22 17:08, Jeff Law via Gcc-patches wrote:

>> However, I'm hoping to forget as many floating point details, as fast
>> as possible, as soon as I can ;-).

> Actually FP isn't that bad -- I'd largely avoided it for decades, but 
> didn't have a choice earlier this year.  And there's a lot more headroom 
> for improvements in the FP space than the integer space IMHO.

One of the more interesting ones is to try to limit the range of the 
input to the trigonometric functions - that way you could use ones 
without any argument reduction phase ...
Aldy Hernandez Aug. 29, 2022, 3:29 p.m. UTC | #9
On Mon, Aug 29, 2022 at 5:08 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 8/29/2022 8:26 AM, Aldy Hernandez wrote:
> > On Mon, Aug 29, 2022 at 4:22 PM Jeff Law via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >> On 8/29/2022 7:31 AM, Aldy Hernandez via Gcc-patches wrote:
> >>> On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >>>> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
> >>>>> It seems to me we can do this optimization regardless, but then treat
> >>>>> positive and negative zero the same throughout the frange class.
> >>>>> Particularly, in frange::singleton_p().  We should never return TRUE
> >>>>> for any version of 0.0.  This will keep VRP from propagating an
> >>>>> incorrect 0.0, since all VRP does is propagate when a range is
> >>>>> provably a singleton.  Also, frange::zero_p() shall return true for
> >>>>> any version of 0.0.
> >>>> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
> >>>> differentiate between 0.0 and -0.0.
> >>>> One reason is e.g. to be able to optimize copysign/signbit - if we can
> >>>> prove that the sign bit on some value will be always cleared or always set,
> >>>> we can fold those.
> >>>> On the other side, with -fno-signed-zeros it is invalid to use
> >>>> copysign/signbit on values that could be zero (well, nothing guarantees
> >>>> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
> >>>> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
> >>>> one thing (just not singleton_p) and not bother with details like whether
> >>>> a range ends or starts with -0.0 or 0.0, either of them would work the same.
> >>>> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
> >>> *head explodes*
> >>>
> >>> Ok, I think I can add a zero property we can track (like we do for
> >>> NAN), and set it appropriately at constant creation and upon results
> >>> from copysign/signbit.  However, I am running out of time before
> >>> Cauldron, so I think I'll just treat +-0.0 ambiguously for now, and do
> >>> that as a follow-up.
> >> We definitely want to be able to track +-0.0 and distinguish between
> >> them.  IIRC there's cases where you can start eliminating comparisons
> >> and arithmetic once you start tracking -0.0 state.
> > Absolutely.  That was always the plan.  However, my goal was just to
> > add relop stubs so others could flesh out the rest.  Alas, I'm way
> > over that now :).  We're tracking NANs, endpoints, infinities, etc.
> Well, we'll probably cycle back and forth a bit between the solver and
> figuring out what we need to track.
>
> Related: I had a short email thread with Siddhesh and Carlos WRT the
> idea of putting in some __builtin_unreachables into the math routines.
> They're generally OK with it, though we do need to verify those
> conditionals aren't generating extra code.   So there's a potential path
> forward for that side of the problem as well.
>
>
> >
> > However, I'm hoping to forget as many floating point details, as fast
> > as possible, as soon as I can ;-).
> Actually FP isn't that bad -- I'd largely avoided it for decades, but
> didn't have a choice earlier this year.  And there's a lot more headroom
> for improvements in the FP space than the integer space IMHO.

Absolutely.  Just working with basic relationals I've noticed that we
generate absolute garbage for some simple testcases.  I'm amazed at what
makes it all the way to the assembly because we fail to fold simple
things.

Aldy
Jeff Law Aug. 29, 2022, 5:07 p.m. UTC | #10
On 8/29/2022 9:13 AM, Toon Moene wrote:
> On 8/29/22 17:08, Jeff Law via Gcc-patches wrote:
>
>>> However, I'm hoping to forget as many floating point details, as fast
>>> as possible, as soon as I can ;-).
>
>> Actually FP isn't that bad -- I'd largely avoided it for decades, but 
>> didn't have a choice earlier this year.  And there's a lot more 
>> headroom for improvements in the FP space than the integer space IMHO.
>
> One of the more interesting ones is to try to limit the range of the 
> input to the trigonometric functions - that way you could use ones 
> without any argument reduction phase ...
The difficult part is that most of the trig stuff is in libraries, so we 
don't have visibility into the full range.

What we do sometimes have is knowledge that the special values are 
already handled which allows us to do things like automatically 
transform a division into estimation + NR correction steps (atan2).

I guess we could do specialization based on the input range.  So rather 
than calling "sin" we could call a special one that didn't have the 
reduction step when we know the input value is in a sensible range.
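
Roughly this shape, written out by hand (__sin_reduced is a made-up
name for such an entry point):

  #include <math.h>

  /* Hypothetical: a sin() variant that assumes its argument is
     already in [-pi/4, pi/4] and so skips argument reduction.  */
  extern double __sin_reduced (double);

  double f (double x)
  {
    /* If the compiler can prove x is in range, the branch folds away
       and only the cheap call remains.  */
    if (x >= -M_PI_4 && x <= M_PI_4)
      return __sin_reduced (x);
    return sin (x);
  }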

Jeff
Li, Pan2 via Gcc-patches Aug. 29, 2022, 5:37 p.m. UTC | #11
> On Aug 29, 2022, at 1:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> ...
> I guess we could do specialization based on the input range.  So rather than calling "sin" we could call a special one that didn't have the reduction step when we know the input value is in a sensible range.

There's some precedent for that, though for a somewhat different reason: functions like "log1p".  And in fact, it would make sense for the optimizer to transform log calls into log1p calls when the range is known to be right for doing so.
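
For example (illustration only):

  #include <math.h>

  /* log1p (x) computes log (1.0 + x) without the rounding error of
     forming 1.0 + x first, so when the range of x is known to be
     tiny this call is a candidate for rewriting as log1p (x).  */
  double f (double x) { return log (1.0 + x); }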

	paul
Toon Moene Aug. 29, 2022, 7:04 p.m. UTC | #12
On 8/29/22 19:07, Jeff Law via Gcc-patches wrote:

>> One of the more interesting ones is to try to limit the range of the 
>> input to the trigonometric functions - that way you could use ones 
>> without any argument reduction phase ...

> The difficult part is that most of the trig stuff is in libraries, so we 
> don't have visibility into the full range.
> 
> What we do sometimes have is knowledge that the special values are 
> already handled which allows us to do things like automatically 
> transform a division into estimation + NR correction steps (atan2).
> 
> I guess we could do specialization based on the input range.  So rather 
> than calling "sin" we could call a special one that didn't have the 
> reduction step when we know the input value is in a sensible range.

Exactly. It's probably not that hard to have sin/cos/tan with a special 
entry point that foregoes the whole argument reduction step.

In every weather forecast, you have to compute the local solar height 
(to get the effects of solar radiation correct) every time step, in 
every grid point.

You *know* that angle is between 0 and 90 degrees, as are all the angles 
that go into that computation (latitude, longitude (and time [hour of
the day, day of the year])).
Aldy Hernandez Aug. 31, 2022, 10:08 a.m. UTC | #13
I think we can model the signed zero problem by keeping track of a
sign property, similar to how we keep track of a NAN property.  The
property can be yes, no, or unknown, and would apply to the entire
range.

[0.0, 0.0] SIGN     => -0.0 singleton
[0.0, 0.0] !SIGN    => +0.0 singleton
[0.0, 0.0] VARYING  => [-0.0, +0.0] sign unknown

frange::singleton_p() would return the appropriate zero if the sign
bit is definitely known, otherwise it would return NULL, which would
keep VRP from propagating it.
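
Roughly (all names here are illustrative, not the final patch):

  // A tristate sign property, analogous to the NAN property.
  enum sign_prop { SIGN_UNKNOWN, SIGN_YES, SIGN_NO };

  bool
  frange::singleton_p (tree *result) const
  {
    if (!real_identical (&m_min, &m_max))
      return false;
    if (real_compare (EQ_EXPR, &m_min, &dconst0))
      {
        // A zero is only a singleton once its sign bit is pinned down.
        if (m_sign == SIGN_UNKNOWN)
          return false;
        if (result)
          {
            REAL_VALUE_TYPE z = (m_sign == SIGN_YES
                                 ? real_value_negate (&dconst0) : dconst0);
            *result = build_real (m_type, z);
          }
        return true;
      }
    if (result)
      *result = build_real (m_type, m_min);
    return true;
  }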

This is a sample of how I envision it working with __builtin_signbit:

=========== BB 2 ============
Imports: x_3(D)
Exports: _1  x_3(D)
         _1 : x_3(D)(I)
x_3(D)  [frange] float VARYING
    <bb 2> :
    _1 = __builtin_signbit (x_3(D));
    if (_1 == 0)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

2->3  (T) _1 :  [irange] int [0, 0] NONZERO 0x0
2->3  (T) x_3(D) :      [frange] float [0.0,  Inf] !SIGN
2->4  (F) _1 :  [irange] int [-INF, -1][1, +INF]
2->4  (F) x_3(D) :      [frange] float [ -Inf, 0.0] SIGN

That is, on the TRUE side x_3 can be assumed to be positive, including
the zero.  On the FALSE side x_3 is negative, also including the zero.

We can keep the endpoints in sync with the sign bit for free, since we
have endpoints.  So, setting the sign bit on a range to either yes or
no, would automatically intersect it to [-INF, 0] or [0, +INF]
respectively.
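
That is, something along these lines (same made-up names as above):

  // Setting the sign property also narrows the endpoints: SIGN
  // implies value <= +0.0 and !SIGN implies value >= -0.0 (NANs are
  // tracked separately by the NAN property).
  void
  frange::set_signbit (bool sign)
  {
    m_sign = sign ? SIGN_YES : SIGN_NO;
    if (sign && real_compare (GT_EXPR, &m_max, &dconst0))
      m_max = dconst0;        // clamp to [m_min, 0.0]
    else if (!sign && real_compare (LT_EXPR, &m_min, &dconst0))
      m_min = dconst0;        // clamp to [0.0, m_max]
  }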

With this in play, VRP could propagate a 0.0 if it knows the sign.  For example:

  if (x == 0.0 && __builtin_signbit (x) == 0)
    bark(x);

...would generate:

=========== BB 2 ============
Imports: x_3(D)
Exports: x_3(D)
x_3(D)  [frange] float VARYING
    <bb 2> :
    if (x_3(D) == 0.0)
      goto <bb 3>; [INV]
    else
      goto <bb 5>; [INV]

2->3  (T) x_3(D) :      [frange] float [0.0, 0.0] !NAN

=========== BB 3 ============
Imports: x_3(D)
Exports: _1  x_3(D)
         _1 : x_3(D)(I)
x_3(D)  [frange] float [0.0, 0.0] !NAN
    <bb 3> :
    _1 = __builtin_signbit (x_3(D));
    if (_1 == 0)
      goto <bb 4>; [INV]
    else
      goto <bb 5>; [INV]

3->4  (T) _1 :  [irange] int [0, 0] NONZERO 0x0
3->4  (T) x_3(D) :      [frange] float [0.0, 0.0] !NAN !SIGN
3->5  (F) _1 :  [irange] int [-INF, -1][1, +INF]
3->5  (F) x_3(D) :      [frange] float [0.0, 0.0] !NAN SIGN

=========== BB 4 ============
x_3(D)  [frange] float [0.0, 0.0] !NAN !SIGN
    <bb 4> :
    bark (0.0);

That is, on the 2->3 edge we know x_3 is 0.0 and !NAN, but have no
information on the sign bit.  Then out of BB3, we know both that x_3
is 0.0 as well as the appropriate sign.  Ultimately this leads us to
propagate +0.0 in BB4.

I have most^Wall of it coded without regressions, with the exception
of how to coerce the range-op machinery to play nice with builtins
(details below).  But I wanted to make sure we're all on the same
page.

A couple questions:

Can I safely assume that +0.0 in the source (say, x = 0.0) has the
sign bit cleared, and vice versa for -0.0?

What's the deal with __builtin_signbit?  Can I fold it to 0/1, or must
I return the actual sign bit?  I see differing behavior depending on
whether we fold a known value or not:

abulafia:~$ cat a.c
float nzero = -0.0;

main(){
    printf("0x%x\n", __builtin_signbit(-0.0));
    printf("0x%x\n", __builtin_signbit(nzero));
}
abulafia:~$ gcc a.c -w && ./a.out
0x1
0x80000000

When Andrew comes back from PTO, we'll need to talk about propagating
builtins.  Currently range-ops' op1_range is used to unwind back from
conditionals.  For example:

    _1 = x_9 + 5
    if (_1 == 0)

On the TRUE side we use op1_range to solve:

    0 = x_9 + 5;

We currently only handle assignments and conditionals.  We would need
the ability to wind back through builtins since __builtin_signbit is
not part of the IL:

    _1 = __builtin_signbit (x_3(D));
    if (_1 == 0)

We have no way to augment the range for x_3 when examining the
builtin.    We do have a way of folding the builtin on a forward
analysis, but that's a separate thing.
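
For the record, the kind of handler I have in mind would look
something like this (purely hypothetical; none of this machinery
exists yet, and set_signbit is the made-up setter sketched above):

  // Given the range of the signbit result (LHS), constrain the float
  // operand on that edge.
  bool
  signbit_op1_range (frange &r, tree type, const irange &lhs)
  {
    r.set_varying (type);
    if (!lhs.contains_p (build_zero_cst (lhs.type ())))
      {
        // signbit (x) != 0  ==>  sign bit set: [-Inf, -0.0] or NAN.
        r.set_signbit (true);
        return true;
      }
    if (lhs.singleton_p ())
      {
        // LHS is exactly [0, 0]  ==>  sign bit clear: [0.0, +Inf] or NAN.
        r.set_signbit (false);
        return true;
      }
    // LHS contains both zero and nonzero values: nothing to deduce.
    return true;
  }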

Thoughts?
Aldy

On Mon, Aug 29, 2022 at 3:22 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Aug 29, 2022 at 03:13:21PM +0200, Aldy Hernandez wrote:
> > It seems to me we can do this optimization regardless, but then treat
> > positive and negative zero the same throughout the frange class.
> > Particularly, in frange::singleton_p().  We should never return TRUE
> > for any version of 0.0.  This will keep VRP from propagating an
> > incorrect 0.0, since all VRP does is propagate when a range is
> > provably a singleton.  Also, frange::zero_p() shall return true for
> > any version of 0.0.
>
> Well, I think for HONOR_SIGNED_ZEROS it would be nice if frange was able to
> differentiate between 0.0 and -0.0.
> One reason is e.g. to be able to optimize copysign/signbit - if we can
> prove that the sign bit on some value will be always cleared or always set,
> we can fold those.
> On the other side, with -fno-signed-zeros it is invalid to use
> copysign/signbit on values that could be zero (well, nothing guarantees
> whether the sign bit is set or clear), so for MODE_HAS_SIGNED_ZEROS &&
> !HONOR_SIGNED_ZEROS it is best to treat contains_p as {-0.0,0.0} being
> one thing (just not singleton_p) and not bother with details like whether
> a range ends or starts with -0.0 or 0.0, either of them would work the same.
> And for !MODE_HAS_SIGNED_ZEROS, obviously 0.0 can be singleton_p.
>
>         Jakub
>

Patch

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index ad2fae578d2..ff9fe312acf 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -252,8 +252,21 @@  foperator_equal::op1_range (frange &r, tree type,
   switch (get_bool_state (r, lhs, type))
     {
     case BRS_TRUE:
-      // If it's true, the result is the same as OP2.
-      r = op2;
+      if (HONOR_SIGNED_ZEROS (type)
+	  && op2.contains_p (build_zero_cst (type)))
+	{
+	  // With signed zeros, x == -0.0 does not mean we can replace
+	  // x with -0.0, because x may be either +0.0 or -0.0.
+	  r.set_varying (type);
+	}
+      else
+	{
+	  // If it's true, the result is the same as OP2.
+	  //
+	  // If the range does not actually contain zeros, this should
+	  // always be OK.
+	  r = op2;
+	}
       // The TRUE side of op1 == op2 implies op1 is !NAN.
       r.set_nan (fp_prop::NO);
       break;