diff mbox series

Implement POINTER_DIFF_EXPR entry in range-op.

Message ID 20210903144158.742979-1-aldyh@redhat.com
State New
Headers show
Series Implement POINTER_DIFF_EXPR entry in range-op. | expand

Commit Message

Aldy Hernandez Sept. 3, 2021, 2:41 p.m. UTC
[Andrew, do you see any problem with using the minus relational code
here?  It seems like everything matches up.]

I've seen cases in the upcoming jump threader enhancements where we see
a difference of two pointers that are known to be equivalent, and yet we
fail to return 0 for the range.  This is because we have no working
range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
adjust_pointer_diff_expr() could get a whack at it here:

//	def = __builtin_memchr (arg, 0, sz)
//	n = def - arg
//
// The range for N can be narrowed to [0, PTRDIFF_MAX - 1].

This patch adds the relational magic to range-op, which we can just
steal from the minus_expr code.

Testing currently in progress.  I will commit pending tests.

gcc/ChangeLog:

	* range-op.cc (operator_minus::op1_op2_relation_effect): Abstract
	out to...
	(minus_op1_op2_relation_effect): ...here.
	(class operator_pointer_diff): New.
	(operator_pointer_diff::op1_op2_relation_effect): Call
	minus_op1_op2_relation_effect.
	(integral_table::integral_table): Add entry for POINTER_DIFF_EXPR.
---
 gcc/range-op.cc | 45 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 38 insertions(+), 7 deletions(-)

Comments

Andrew MacLeod Sept. 3, 2021, 3:05 p.m. UTC | #1
On 9/3/21 10:41 AM, Aldy Hernandez wrote:
> [Andrew, do you see any problem with using the minus relational code
> here?  It seems like everything matches up.]
>
> I've seen cases in the upcoming jump threader enhancements where we see
> a difference of two pointers that are known to be equivalent, and yet we
> fail to return 0 for the range.  This is because we have no working
> range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
> a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
> adjust_pointer_diff_expr() could get a whack at it here:
>
> //	def = __builtin_memchr (arg, 0, sz)
> //	n = def - arg
> //
> // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
>
In theory... but do the non-equality relations make sense?   ie ptr1 < 
ptr2   ?   Perhaps there is no harm in those..  Probably undefined 
behaviour if they are not related so we can do whatever...


Andrew
Jeff Law Sept. 3, 2021, 3:09 p.m. UTC | #2
On 9/3/2021 9:05 AM, Andrew MacLeod via Gcc-patches wrote:
> On 9/3/21 10:41 AM, Aldy Hernandez wrote:
>> [Andrew, do you see any problem with using the minus relational code
>> here?  It seems like everything matches up.]
>>
>> I've seen cases in the upcoming jump threader enhancements where we see
>> a difference of two pointers that are known to be equivalent, and yet we
>> fail to return 0 for the range.  This is because we have no working
>> range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
>> a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
>> adjust_pointer_diff_expr() could get a whack at it here:
>>
>> //    def = __builtin_memchr (arg, 0, sz)
>> //    n = def - arg
>> //
>> // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
>>
> In theory... but do the non-equality relations make sense?   ie ptr1 < 
> ptr2   ?   Perhaps there is no harm in those..  Probably undefined 
> behaviour if they are not related so we can do whatever...
I'm pretty sure they're in the realm of undefined behavior if the 
pointers point to different objects.  There was some code that would 
have cared about this, but I think it mostly got cleaned up in response 
to Martin S's diagnostic work over the last couple years.

Jeff
Aldy Hernandez Sept. 3, 2021, 3:35 p.m. UTC | #3
On 9/3/21 5:05 PM, Andrew MacLeod wrote:
> On 9/3/21 10:41 AM, Aldy Hernandez wrote:
>> [Andrew, do you see any problem with using the minus relational code
>> here?  It seems like everything matches up.]
>>
>> I've seen cases in the upcoming jump threader enhancements where we see
>> a difference of two pointers that are known to be equivalent, and yet we
>> fail to return 0 for the range.  This is because we have no working
>> range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
>> a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
>> adjust_pointer_diff_expr() could get a whack at it here:
>>
>> //    def = __builtin_memchr (arg, 0, sz)
>> //    n = def - arg
>> //
>> // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
>>
> In theory... but do the non-equality relations make sense?   ie ptr1 < 
> ptr2   ?   Perhaps there is no harm in those..  Probably undefined 
> behaviour if they are not related so we can do whatever...

I originally just copy-pasted the stuff dealing with == and !=, but 
figured why not piggyback onto the existing minus code.

I'm ok with doing whatever for undefined behavior.  At least it's all 
encapsulated in a single place.

Will commit when tests finish.

Thanks.
Aldy
Jakub Jelinek Sept. 3, 2021, 3:41 p.m. UTC | #4
On Fri, Sep 03, 2021 at 09:09:59AM -0600, Jeff Law via Gcc-patches wrote:
> 
> 
> On 9/3/2021 9:05 AM, Andrew MacLeod via Gcc-patches wrote:
> > On 9/3/21 10:41 AM, Aldy Hernandez wrote:
> > > [Andrew, do you see any problem with using the minus relational code
> > > here?  It seems like everything matches up.]
> > > 
> > > I've seen cases in the upcoming jump threader enhancements where we see
> > > a difference of two pointers that are known to be equivalent, and yet we
> > > fail to return 0 for the range.  This is because we have no working
> > > range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
> > > a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
> > > adjust_pointer_diff_expr() could get a whack at it here:
> > > 
> > > //    def = __builtin_memchr (arg, 0, sz)
> > > //    n = def - arg
> > > //
> > > // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
> > > 
> > In theory... but do the non-equality relations make sense?   ie ptr1 <
> > ptr2   ?   Perhaps there is no harm in those..  Probably undefined
> > behaviour if they are not related so we can do whatever...
> I'm pretty sure they're in the realm of undefined behavior if the pointers
> point to different objects.  There was some code that would have cared about
> this, but I think it mostly got cleaned up in response to Martin S's
> diagnostic work over the last couple years.

Yeah.  Just note, [0, PTRDIFF_MAX - 1] range will be only for builtins
like memchr/strchr/mempcpy etc. that return either the passed argument or that pointer
incremented some number of times.  Generally pointer arith can go in both
directions, so POINTER_DIFF_EXPR can be negative too.
If one of the pointers points into some VAR_DECL object, the range could be
narrowed from the size of that object though.
char a[26];

long
foo (long i, char *q)
{
  char *p = &a[0] + i;
  return q - p;
}
The minimum result will be if q points to &a[0] and p to &a[26], i.e.
-26, maximum the other way, so [-26, 26] range.

	Jakub
Aldy Hernandez Sept. 3, 2021, 4:39 p.m. UTC | #5
On 9/3/21 5:41 PM, Jakub Jelinek wrote:
> On Fri, Sep 03, 2021 at 09:09:59AM -0600, Jeff Law via Gcc-patches wrote:
>>
>>
>> On 9/3/2021 9:05 AM, Andrew MacLeod via Gcc-patches wrote:
>>> On 9/3/21 10:41 AM, Aldy Hernandez wrote:
>>>> [Andrew, do you see any problem with using the minus relational code
>>>> here?  It seems like everything matches up.]
>>>>
>>>> I've seen cases in the upcoming jump threader enhancements where we see
>>>> a difference of two pointers that are known to be equivalent, and yet we
>>>> fail to return 0 for the range.  This is because we have no working
>>>> range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
>>>> a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
>>>> adjust_pointer_diff_expr() could get a whack at it here:
>>>>
>>>> //    def = __builtin_memchr (arg, 0, sz)
>>>> //    n = def - arg
>>>> //
>>>> // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
>>>>
>>> In theory... but do the non-equality relations make sense?   ie ptr1 <
>>> ptr2   ?   Perhaps there is no harm in those..  Probably undefined
>>> behaviour if they are not related so we can do whatever...
>> I'm pretty sure they're in the realm of undefined behavior if the pointers
>> point to different objects.  There was some code that would have cared about
>> this, but I think it mostly got cleaned up in response to Martin S's
>> diagnostic work over the last couple years.
> 
> Yeah.  Just note, [0, PTRDIFF_MAX - 1] range will be only for builtins
> like memchr/strchr/mempcpy etc. that return either the passed argument or that pointer
> incremented some number of times.  Generally pointer arith can go in both
> directions, so POINTER_DIFF_EXPR can be negative too.
> If one of the pointers points into some VAR_DECL object, the range could be
> narrowed from the size of that object though.

The [0, PTRDIFF_MAX - 1] stuff I spoke of, is the code in 
adjust_pointer_diff_expr().  It is meant to be used when chasing the 
defining statement of a POINTER_DIFF_EXPR is in a specific form. 
Currently it only applies to __builtin_memchr.  I believe the code came 
from vr-values (or tree-vrp) at some point ??.

Do you think the code there should be ehanced to include other built-ins 
other than __builtin_memchr?  Patches welcome :)).

BTW, my proposed patch only deals with relations between the operands. 
It has nothing to do with builtins or other such uses.  For example, if 
both operands to a POINTER_DIFF_EXPR are equal, we can deduce that the 
result will be 0.  I frankly only care about the == and != relationship 
presently, but it was easy enough to hijack the MINUS_EXPR relation code.

Thanks.
Aldy

> char a[26];
> 
> long
> foo (long i, char *q)
> {
>    char *p = &a[0] + i;
>    return q - p;
> }
> The minimum result will be if q points to &a[0] and p to &a[26], i.e.
> -26, maximum the other way, so [-26, 26] range.
> 
> 	Jakub
>
Martin Sebor Sept. 3, 2021, 4:42 p.m. UTC | #6
On 9/3/21 8:41 AM, Aldy Hernandez via Gcc-patches wrote:
> [Andrew, do you see any problem with using the minus relational code
> here?  It seems like everything matches up.]
> 
> I've seen cases in the upcoming jump threader enhancements where we see
> a difference of two pointers that are known to be equivalent, and yet we
> fail to return 0 for the range.  This is because we have no working
> range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
> a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
> adjust_pointer_diff_expr() could get a whack at it here:
> 
> //	def = __builtin_memchr (arg, 0, sz)
> //	n = def - arg
> //
> // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].

For the two statements above a tighter bound is possible: [0, sz).

With no range info, sz can be assumed to be in [0, N], where N
is one less than the smaller of PTRDIFF_MAX and the size of arg.
(With PTRDIFF_MAX being only hypothetical.  max_object_size()
should at some return the actual limit for the current target).

The same (or even tighter) range can be derived from calls to
other functions, including like mempcpy/stpcpy, or strchr, etc.

Martin

> 
> This patch adds the relational magic to range-op, which we can just
> steal from the minus_expr code.
> 
> Testing currently in progress.  I will commit pending tests.
> 
> gcc/ChangeLog:
> 
> 	* range-op.cc (operator_minus::op1_op2_relation_effect): Abstract
> 	out to...
> 	(minus_op1_op2_relation_effect): ...here.
> 	(class operator_pointer_diff): New.
> 	(operator_pointer_diff::op1_op2_relation_effect): Call
> 	minus_op1_op2_relation_effect.
> 	(integral_table::integral_table): Add entry for POINTER_DIFF_EXPR.
> ---
>   gcc/range-op.cc | 45 ++++++++++++++++++++++++++++++++++++++-------
>   1 file changed, 38 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/range-op.cc b/gcc/range-op.cc
> index fee0e834c23..5e37133026d 100644
> --- a/gcc/range-op.cc
> +++ b/gcc/range-op.cc
> @@ -1372,13 +1372,14 @@ operator_minus::wi_fold (irange &r, tree type,
>   }
>   
>   // Check to see if the relation REL between OP1 and OP2 has any effect on the
> -// LHS of the expression.  If so, apply it to LHS_RANGE.
> +// LHS of the expression.  If so, apply it to LHS_RANGE.  This is a helper
> +// function for both MINUS_EXPR and POINTER_DIFF_EXPR.
>   
> -bool
> -operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
> -				      const irange &op1_range ATTRIBUTE_UNUSED,
> -				      const irange &op2_range ATTRIBUTE_UNUSED,
> -				      relation_kind rel) const
> +static bool
> +minus_op1_op2_relation_effect (irange &lhs_range, tree type,
> +			       const irange &op1_range ATTRIBUTE_UNUSED,
> +			       const irange &op2_range ATTRIBUTE_UNUSED,
> +			       relation_kind rel)
>   {
>     if (rel == VREL_NONE)
>       return false;
> @@ -1440,6 +1441,16 @@ operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
>     return true;
>   }
>   
> +bool
> +operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
> +					 const irange &op1_range,
> +					 const irange &op2_range,
> +					 relation_kind rel) const
> +{
> +  return minus_op1_op2_relation_effect (lhs_range, type, op1_range, op2_range,
> +					rel);
> +}
> +
>   bool
>   operator_minus::op1_range (irange &r, tree type,
>   			   const irange &lhs,
> @@ -1459,6 +1470,26 @@ operator_minus::op2_range (irange &r, tree type,
>   }
>   
>   
> +class operator_pointer_diff : public range_operator
> +{
> +  virtual bool op1_op2_relation_effect (irange &lhs_range,
> +					tree type,
> +					const irange &op1_range,
> +					const irange &op2_range,
> +					relation_kind rel) const;
> +} op_pointer_diff;
> +
> +bool
> +operator_pointer_diff::op1_op2_relation_effect (irange &lhs_range, tree type,
> +						const irange &op1_range,
> +						const irange &op2_range,
> +						relation_kind rel) const
> +{
> +  return minus_op1_op2_relation_effect (lhs_range, type, op1_range, op2_range,
> +					rel);
> +}
> +
> +
>   class operator_min : public range_operator
>   {
>   public:
> @@ -4018,7 +4049,7 @@ integral_table::integral_table ()
>     set (OBJ_TYPE_REF, op_identity);
>     set (IMAGPART_EXPR, op_unknown);
>     set (REALPART_EXPR, op_unknown);
> -  set (POINTER_DIFF_EXPR, op_unknown);
> +  set (POINTER_DIFF_EXPR, op_pointer_diff);
>     set (ABS_EXPR, op_abs);
>     set (ABSU_EXPR, op_absu);
>     set (NEGATE_EXPR, op_negate);
>
Aldy Hernandez Sept. 3, 2021, 4:47 p.m. UTC | #7
Patches welcome :-).

On Fri, Sep 3, 2021, 18:42 Martin Sebor <msebor@gmail.com> wrote:

> On 9/3/21 8:41 AM, Aldy Hernandez via Gcc-patches wrote:
> > [Andrew, do you see any problem with using the minus relational code
> > here?  It seems like everything matches up.]
> >
> > I've seen cases in the upcoming jump threader enhancements where we see
> > a difference of two pointers that are known to be equivalent, and yet we
> > fail to return 0 for the range.  This is because we have no working
> > range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
> > a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
> > adjust_pointer_diff_expr() could get a whack at it here:
> >
> > //    def = __builtin_memchr (arg, 0, sz)
> > //    n = def - arg
> > //
> > // The range for N can be narrowed to [0, PTRDIFF_MAX - 1].
>
> For the two statements above a tighter bound is possible: [0, sz).
>
> With no range info, sz can be assumed to be in [0, N], where N
> is one less than the smaller of PTRDIFF_MAX and the size of arg.
> (With PTRDIFF_MAX being only hypothetical.  max_object_size()
> should at some return the actual limit for the current target).
>
> The same (or even tighter) range can be derived from calls to
> other functions, including like mempcpy/stpcpy, or strchr, etc.
>
> Martin
>
> >
> > This patch adds the relational magic to range-op, which we can just
> > steal from the minus_expr code.
> >
> > Testing currently in progress.  I will commit pending tests.
> >
> > gcc/ChangeLog:
> >
> >       * range-op.cc (operator_minus::op1_op2_relation_effect): Abstract
> >       out to...
> >       (minus_op1_op2_relation_effect): ...here.
> >       (class operator_pointer_diff): New.
> >       (operator_pointer_diff::op1_op2_relation_effect): Call
> >       minus_op1_op2_relation_effect.
> >       (integral_table::integral_table): Add entry for POINTER_DIFF_EXPR.
> > ---
> >   gcc/range-op.cc | 45 ++++++++++++++++++++++++++++++++++++++-------
> >   1 file changed, 38 insertions(+), 7 deletions(-)
> >
> > diff --git a/gcc/range-op.cc b/gcc/range-op.cc
> > index fee0e834c23..5e37133026d 100644
> > --- a/gcc/range-op.cc
> > +++ b/gcc/range-op.cc
> > @@ -1372,13 +1372,14 @@ operator_minus::wi_fold (irange &r, tree type,
> >   }
> >
> >   // Check to see if the relation REL between OP1 and OP2 has any effect
> on the
> > -// LHS of the expression.  If so, apply it to LHS_RANGE.
> > +// LHS of the expression.  If so, apply it to LHS_RANGE.  This is a
> helper
> > +// function for both MINUS_EXPR and POINTER_DIFF_EXPR.
> >
> > -bool
> > -operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
> > -                                   const irange &op1_range
> ATTRIBUTE_UNUSED,
> > -                                   const irange &op2_range
> ATTRIBUTE_UNUSED,
> > -                                   relation_kind rel) const
> > +static bool
> > +minus_op1_op2_relation_effect (irange &lhs_range, tree type,
> > +                            const irange &op1_range ATTRIBUTE_UNUSED,
> > +                            const irange &op2_range ATTRIBUTE_UNUSED,
> > +                            relation_kind rel)
> >   {
> >     if (rel == VREL_NONE)
> >       return false;
> > @@ -1440,6 +1441,16 @@ operator_minus::op1_op2_relation_effect (irange
> &lhs_range, tree type,
> >     return true;
> >   }
> >
> > +bool
> > +operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
> > +                                      const irange &op1_range,
> > +                                      const irange &op2_range,
> > +                                      relation_kind rel) const
> > +{
> > +  return minus_op1_op2_relation_effect (lhs_range, type, op1_range,
> op2_range,
> > +                                     rel);
> > +}
> > +
> >   bool
> >   operator_minus::op1_range (irange &r, tree type,
> >                          const irange &lhs,
> > @@ -1459,6 +1470,26 @@ operator_minus::op2_range (irange &r, tree type,
> >   }
> >
> >
> > +class operator_pointer_diff : public range_operator
> > +{
> > +  virtual bool op1_op2_relation_effect (irange &lhs_range,
> > +                                     tree type,
> > +                                     const irange &op1_range,
> > +                                     const irange &op2_range,
> > +                                     relation_kind rel) const;
> > +} op_pointer_diff;
> > +
> > +bool
> > +operator_pointer_diff::op1_op2_relation_effect (irange &lhs_range, tree
> type,
> > +                                             const irange &op1_range,
> > +                                             const irange &op2_range,
> > +                                             relation_kind rel) const
> > +{
> > +  return minus_op1_op2_relation_effect (lhs_range, type, op1_range,
> op2_range,
> > +                                     rel);
> > +}
> > +
> > +
> >   class operator_min : public range_operator
> >   {
> >   public:
> > @@ -4018,7 +4049,7 @@ integral_table::integral_table ()
> >     set (OBJ_TYPE_REF, op_identity);
> >     set (IMAGPART_EXPR, op_unknown);
> >     set (REALPART_EXPR, op_unknown);
> > -  set (POINTER_DIFF_EXPR, op_unknown);
> > +  set (POINTER_DIFF_EXPR, op_pointer_diff);
> >     set (ABS_EXPR, op_abs);
> >     set (ABSU_EXPR, op_absu);
> >     set (NEGATE_EXPR, op_negate);
> >
>
>
diff mbox series

Patch

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index fee0e834c23..5e37133026d 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1372,13 +1372,14 @@  operator_minus::wi_fold (irange &r, tree type,
 }
 
 // Check to see if the relation REL between OP1 and OP2 has any effect on the
-// LHS of the expression.  If so, apply it to LHS_RANGE.
+// LHS of the expression.  If so, apply it to LHS_RANGE.  This is a helper
+// function for both MINUS_EXPR and POINTER_DIFF_EXPR.
 
-bool
-operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
-				      const irange &op1_range ATTRIBUTE_UNUSED,
-				      const irange &op2_range ATTRIBUTE_UNUSED,
-				      relation_kind rel) const
+static bool
+minus_op1_op2_relation_effect (irange &lhs_range, tree type,
+			       const irange &op1_range ATTRIBUTE_UNUSED,
+			       const irange &op2_range ATTRIBUTE_UNUSED,
+			       relation_kind rel)
 {
   if (rel == VREL_NONE)
     return false;
@@ -1440,6 +1441,16 @@  operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
   return true;
 }
 
+bool
+operator_minus::op1_op2_relation_effect (irange &lhs_range, tree type,
+					 const irange &op1_range,
+					 const irange &op2_range,
+					 relation_kind rel) const
+{
+  return minus_op1_op2_relation_effect (lhs_range, type, op1_range, op2_range,
+					rel);
+}
+
 bool
 operator_minus::op1_range (irange &r, tree type,
 			   const irange &lhs,
@@ -1459,6 +1470,26 @@  operator_minus::op2_range (irange &r, tree type,
 }
 
 
+class operator_pointer_diff : public range_operator
+{
+  virtual bool op1_op2_relation_effect (irange &lhs_range,
+					tree type,
+					const irange &op1_range,
+					const irange &op2_range,
+					relation_kind rel) const;
+} op_pointer_diff;
+
+bool
+operator_pointer_diff::op1_op2_relation_effect (irange &lhs_range, tree type,
+						const irange &op1_range,
+						const irange &op2_range,
+						relation_kind rel) const
+{
+  return minus_op1_op2_relation_effect (lhs_range, type, op1_range, op2_range,
+					rel);
+}
+
+
 class operator_min : public range_operator
 {
 public:
@@ -4018,7 +4049,7 @@  integral_table::integral_table ()
   set (OBJ_TYPE_REF, op_identity);
   set (IMAGPART_EXPR, op_unknown);
   set (REALPART_EXPR, op_unknown);
-  set (POINTER_DIFF_EXPR, op_unknown);
+  set (POINTER_DIFF_EXPR, op_pointer_diff);
   set (ABS_EXPR, op_abs);
   set (ABSU_EXPR, op_absu);
   set (NEGATE_EXPR, op_negate);