diff mbox series

c++, match.pd: Evaluate in constant evaluation comparisons like &var1 + 12 == &var2 + 24 [PR89074]

Message ID 20220107120639.GK2646553@tucnak
State New
Headers show
Series c++, match.pd: Evaluate in constant evaluation comparisons like &var1 + 12 == &var2 + 24 [PR89074] | expand

Commit Message

Jakub Jelinek Jan. 7, 2022, 12:06 p.m. UTC
Hi!

The match.pd address_comparison simplification can only handle
ADDR_EXPR comparisons possibly converted to some other type (I wonder
if we shouldn't restrict it in address_compare to casts to pointer
types or pointer-sized integer types, I think we shouldn't optimize
(short) (&var) == (short) (&var2) because we really don't know whether
it will be true or false).  On GIMPLE, most of pointer to pointer
casts are useless and optimized away and further we have in
gimple_fold_stmt_to_constant_1 an optimization that folds
&something p+ const_int
into
&MEM_REF[..., off]
On GENERIC, we don't do that and e.g. for constant evaluation it
could be pretty harmful if e.g. such pointers are dereferenced, because
it can lose what exact field it was starting with etc., all it knows
is the base and offset, type and alias set.
Instead of teaching the match.pd address_compare about 3 extra variants
where one or both compared operands are pointer_plus, this patch attempts
to fold operands of comparisons similarly to gimple_fold_stmt_to_constant_1
before calling fold_binary on it.
There is another thing though, while we do have (x p+ y) p+ z to
x p+ (y + z) simplification which works on GIMPLE well because of the
useless pointer conversions, on GENERIC we can have pointer casts in between
and at that point we can end up with large expressions like
((type3) (((type2) ((type1) (&var + 2) + 2) + 2) + 2))
etc.  Pointer-plus doesn't really care what exact pointer type it has as
long as it is a pointer, so the following match.pd simplification for
GENERIC only (it is useless for GIMPLE) also moves the cast so that nested
p+ can be simplified.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I've noticed we don't really diagnose going out of bounds with
pointer_plus (unlike e.g. with ARRAY_REF) during constant evaluation, I
think another patch for cxx_eval_binary_expression with POINTER_PLUS will be
needed.  But it isn't clear to me what exactly it should do in case of
subobjects.  If we start with address of a whole var, (&var), I guess we
should diagnose if the pointer_plus gets before start of the var (i.e.
"negative") or 1 byte past the end of the var, but what if we start with
&var.field or &var.field[3] ?  For &var.field, shall we diagnose out of
bounds of field (except perhaps flexible members?) or the whole var?
For ARRAY_REFs, I assume we must at least strip all the outer ARRAY_REFs
and so start with &var.field too, right?

2022-01-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/89074
gcc/
	* match.pd ((ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z))): New GENERIC
	simplification.
gcc/cp/
	* constexpr.c (cxx_maybe_fold_addr_pointer_plus): New function.
	(cxx_eval_binary_expression): Use it.
gcc/testsuite/
	* g++.dg/cpp1y/constexpr-89074-2.C: New test.
	* g++.dg/cpp1z/constexpr-89074-1.C: New test.


	Jakub

Comments

Jason Merrill Jan. 8, 2022, 6:08 a.m. UTC | #1
On 1/7/22 07:06, Jakub Jelinek wrote:
> Hi!
> 
> The match.pd address_comparison simplification can only handle
> ADDR_EXPR comparisons possibly converted to some other type (I wonder
> if we shouldn't restrict it in address_compare to casts to pointer
> types or pointer-sized integer types, I think we shouldn't optimize
> (short) (&var) == (short) (&var2) because we really don't know whether
> it will be true or false).  On GIMPLE, most of pointer to pointer
> casts are useless and optimized away and further we have in
> gimple_fold_stmt_to_constant_1 an optimization that folds
> &something p+ const_int
> into
> &MEM_REF[..., off]
> On GENERIC, we don't do that and e.g. for constant evaluation it
> could be pretty harmful if e.g. such pointers are dereferenced, because
> it can lose what exact field it was starting with etc., all it knows
> is the base and offset, type and alias set.
> Instead of teaching the match.pd address_compare about 3 extra variants
> where one or both compared operands are pointer_plus, this patch attempts
> to fold operands of comparisons similarly to gimple_fold_stmt_to_constant_1
> before calling fold_binary on it.
> There is another thing though, while we do have (x p+ y) p+ z to
> x p+ (y + z) simplification which works on GIMPLE well because of the
> useless pointer conversions, on GENERIC we can have pointer casts in between
> and at that point we can end up with large expressions like
> ((type3) (((type2) ((type1) (&var + 2) + 2) + 2) + 2))
> etc.  Pointer-plus doesn't really care what exact pointer type it has as
> long as it is a pointer, so the following match.pd simplification for
> GENERIC only (it is useless for GIMPLE) also moves the cast so that nested
> p+ can be simplified.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

LGTM.

> Note, I've noticed we don't really diagnose going out of bounds with
> pointer_plus (unlike e.g. with ARRAY_REF) during constant evaluation, I
> think another patch for cxx_eval_binary_expression with POINTER_PLUS will be
> needed.  But it isn't clear to me what exactly it should do in case of
> subobjects.  If we start with address of a whole var, (&var), I guess we
> should diagnose if the pointer_plus gets before start of the var (i.e.
> "negative") or 1 byte past the end of the var, but what if we start with
> &var.field or &var.field[3] ?  For &var.field, shall we diagnose out of
> bounds of field (except perhaps flexible members? or the whole var?
The field.  And a flexible member has unknown bounds.

> For ARRAY_REFs, I assume we must at least strip all the outer ARRAY_REFs
> and so start with &var.field too, right?

A strict reading suggests that we should complain about going outside 
the bounds of the inner array, but flattening multidimensional arrays as 
you suggest seems reasonable as well.

> 2022-01-07  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/89074
> gcc/
> 	* match.pd ((ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z))): New GENERIC
> 	simplification.
> gcc/cp/
> 	* constexpr.c (cxx_maybe_fold_addr_pointer_plus): New function.
> 	(cxx_eval_binary_expression): Use it.
> gcc/testsuite/
> 	* g++.dg/cpp1y/constexpr-89074-2.C: New test.
> 	* g++.dg/cpp1z/constexpr-89074-1.C: New test.
> 
> --- gcc/match.pd.jj	2022-01-05 20:30:08.768806236 +0100
> +++ gcc/match.pd	2022-01-06 19:59:53.596114417 +0100
> @@ -2143,6 +2143,11 @@ (define_operator_list SYNC_FETCH_AND_AND
>   (simplify
>     (pointer_plus (pointer_plus:s @0 @1) @3)
>     (pointer_plus @0 (plus @1 @3)))
> +#if GENERIC
> +(simplify
> +  (pointer_plus (convert:s (pointer_plus:s @0 @1)) @3)
> +  (convert:type (pointer_plus @0 (plus @1 @3))))
> +#endif
>   
>   /* Pattern match
>        tem1 = (long) ptr1;
> --- gcc/cp/constexpr.c.jj	2022-01-03 10:40:48.403063535 +0100
> +++ gcc/cp/constexpr.c	2022-01-06 20:47:44.596623219 +0100
> @@ -3288,6 +3288,38 @@ cxx_fold_pointer_plus_expression (const
>     return NULL_TREE;
>   }
>   
> +/* Try to fold expressions like
> +   (struct S *) (&a[0].D.2378 + 12)
> +   into
> +   &MEM <struct T> [(void *)&a + 12B]
> +   This is something normally done by gimple_fold_stmt_to_constant_1
> +   on GIMPLE, but is undesirable on GENERIC if we are e.g. going to
> +   dereference the address because some details are lost.
> +   For pointer comparisons we want such folding though so that
> +   match.pd address_compare optimization works.  */
> +
> +static tree
> +cxx_maybe_fold_addr_pointer_plus (tree t)
> +{
> +  while (CONVERT_EXPR_P (t)
> +	 && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 0))))
> +    t = TREE_OPERAND (t, 0);
> +  if (TREE_CODE (t) != POINTER_PLUS_EXPR)
> +    return NULL_TREE;
> +  tree op0 = TREE_OPERAND (t, 0);
> +  tree op1 = TREE_OPERAND (t, 1);
> +  if (TREE_CODE (op1) != INTEGER_CST)
> +    return NULL_TREE;
> +  while (CONVERT_EXPR_P (op0)
> +	 && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))))
> +    op0 = TREE_OPERAND (op0, 0);
> +  if (TREE_CODE (op0) != ADDR_EXPR)
> +    return NULL_TREE;
> +  op1 = fold_convert (ptr_type_node, op1);
> +  tree r = fold_build2 (MEM_REF, TREE_TYPE (TREE_TYPE (op0)), op0, op1);
> +  return build1_loc (EXPR_LOCATION (t), ADDR_EXPR, TREE_TYPE (op0), r);
> +}
> +
>   /* Subroutine of cxx_eval_constant_expression.
>      Like cxx_eval_unary_expression, except for binary expressions.  */
>   
> @@ -3347,6 +3379,15 @@ cxx_eval_binary_expression (const conste
>         else if (TREE_CODE (rhs) == PTRMEM_CST)
>   	rhs = cplus_expand_constant (rhs);
>       }
> +  if (r == NULL_TREE
> +      && TREE_CODE_CLASS (code) == tcc_comparison
> +      && POINTER_TYPE_P (TREE_TYPE (lhs)))
> +    {
> +      if (tree lhso = cxx_maybe_fold_addr_pointer_plus (lhs))
> +	lhs = fold_convert (TREE_TYPE (lhs), lhso);
> +      if (tree rhso = cxx_maybe_fold_addr_pointer_plus (rhs))
> +	rhs = fold_convert (TREE_TYPE (rhs), rhso);
> +    }
>     if (code == POINTER_PLUS_EXPR && !*non_constant_p
>         && integer_zerop (lhs) && !integer_zerop (rhs))
>       {
> --- gcc/testsuite/g++.dg/cpp1y/constexpr-89074-2.C.jj	2022-01-06 20:51:52.327080068 +0100
> +++ gcc/testsuite/g++.dg/cpp1y/constexpr-89074-2.C	2022-01-06 20:51:18.338566365 +0100
> @@ -0,0 +1,19 @@
> +// PR c++/89074
> +// { dg-do compile { target c++14 } }
> +
> +constexpr bool
> +foo ()
> +{
> +  int a[] = { 1, 2 };
> +  int b[] = { 3, 4 };
> +
> +  if (a + 0 == b + 0)
> +    return false;
> +
> +  if (a + 1 == b + 0)
> +    return false;
> +
> +  return true;
> +}
> +
> +static_assert (foo (), "");
> --- gcc/testsuite/g++.dg/cpp1z/constexpr-89074-1.C.jj	2022-01-06 20:55:33.204919807 +0100
> +++ gcc/testsuite/g++.dg/cpp1z/constexpr-89074-1.C	2022-01-06 20:55:12.566215101 +0100
> @@ -0,0 +1,26 @@
> +// PR c++/89074
> +// { dg-do compile { target c++17 } }
> +
> +struct S { int s; };
> +struct T : public S { };
> +struct U : public T { };
> +
> +constexpr bool
> +foo ()
> +{
> +  U a[] = { 1, 2, 3, 4 };
> +  U b[] = { 5, 6, 7, 8 };
> +  T *c = (T *) a + 1;
> +  S *d = (S *) c + 2;
> +  S *e = (S *) b + 1;
> +
> +  if (a + 0 == b + 0)
> +    return false;
> +
> +  if (d == e)
> +    return false;
> +
> +  return true;
> +}
> +
> +static_assert (foo (), "");
> 
> 	Jakub
>
diff mbox series

Patch

--- gcc/match.pd.jj	2022-01-05 20:30:08.768806236 +0100
+++ gcc/match.pd	2022-01-06 19:59:53.596114417 +0100
@@ -2143,6 +2143,11 @@  (define_operator_list SYNC_FETCH_AND_AND
 (simplify
   (pointer_plus (pointer_plus:s @0 @1) @3)
   (pointer_plus @0 (plus @1 @3)))
+#if GENERIC
+(simplify
+  (pointer_plus (convert:s (pointer_plus:s @0 @1)) @3)
+  (convert:type (pointer_plus @0 (plus @1 @3))))
+#endif
 
 /* Pattern match
      tem1 = (long) ptr1;
--- gcc/cp/constexpr.c.jj	2022-01-03 10:40:48.403063535 +0100
+++ gcc/cp/constexpr.c	2022-01-06 20:47:44.596623219 +0100
@@ -3288,6 +3288,38 @@  cxx_fold_pointer_plus_expression (const
   return NULL_TREE;
 }
 
+/* Try to fold expressions like
+   (struct S *) (&a[0].D.2378 + 12)
+   into
+   &MEM <struct T> [(void *)&a + 12B]
+   This is something normally done by gimple_fold_stmt_to_constant_1
+   on GIMPLE, but is undesirable on GENERIC if we are e.g. going to
+   dereference the address because some details are lost.
+   For pointer comparisons we want such folding though so that
+   match.pd address_compare optimization works.  */
+
+static tree
+cxx_maybe_fold_addr_pointer_plus (tree t)
+{
+  while (CONVERT_EXPR_P (t)
+	 && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 0))))
+    t = TREE_OPERAND (t, 0);
+  if (TREE_CODE (t) != POINTER_PLUS_EXPR)
+    return NULL_TREE;
+  tree op0 = TREE_OPERAND (t, 0);
+  tree op1 = TREE_OPERAND (t, 1);
+  if (TREE_CODE (op1) != INTEGER_CST)
+    return NULL_TREE;
+  while (CONVERT_EXPR_P (op0)
+	 && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))))
+    op0 = TREE_OPERAND (op0, 0);
+  if (TREE_CODE (op0) != ADDR_EXPR)
+    return NULL_TREE;
+  op1 = fold_convert (ptr_type_node, op1);
+  tree r = fold_build2 (MEM_REF, TREE_TYPE (TREE_TYPE (op0)), op0, op1);
+  return build1_loc (EXPR_LOCATION (t), ADDR_EXPR, TREE_TYPE (op0), r);
+}
+
 /* Subroutine of cxx_eval_constant_expression.
    Like cxx_eval_unary_expression, except for binary expressions.  */
 
@@ -3347,6 +3379,15 @@  cxx_eval_binary_expression (const conste
       else if (TREE_CODE (rhs) == PTRMEM_CST)
 	rhs = cplus_expand_constant (rhs);
     }
+  if (r == NULL_TREE
+      && TREE_CODE_CLASS (code) == tcc_comparison
+      && POINTER_TYPE_P (TREE_TYPE (lhs)))
+    {
+      if (tree lhso = cxx_maybe_fold_addr_pointer_plus (lhs))
+	lhs = fold_convert (TREE_TYPE (lhs), lhso);
+      if (tree rhso = cxx_maybe_fold_addr_pointer_plus (rhs))
+	rhs = fold_convert (TREE_TYPE (rhs), rhso);
+    }
   if (code == POINTER_PLUS_EXPR && !*non_constant_p
       && integer_zerop (lhs) && !integer_zerop (rhs))
     {
--- gcc/testsuite/g++.dg/cpp1y/constexpr-89074-2.C.jj	2022-01-06 20:51:52.327080068 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-89074-2.C	2022-01-06 20:51:18.338566365 +0100
@@ -0,0 +1,19 @@ 
+// PR c++/89074
+// { dg-do compile { target c++14 } }
+
+constexpr bool
+foo ()
+{
+  int a[] = { 1, 2 };
+  int b[] = { 3, 4 };
+
+  if (a + 0 == b + 0)
+    return false;
+
+  if (a + 1 == b + 0)
+    return false;
+
+  return true;
+}
+
+static_assert (foo (), "");
--- gcc/testsuite/g++.dg/cpp1z/constexpr-89074-1.C.jj	2022-01-06 20:55:33.204919807 +0100
+++ gcc/testsuite/g++.dg/cpp1z/constexpr-89074-1.C	2022-01-06 20:55:12.566215101 +0100
@@ -0,0 +1,26 @@ 
+// PR c++/89074
+// { dg-do compile { target c++17 } }
+
+struct S { int s; };
+struct T : public S { };
+struct U : public T { };
+
+constexpr bool
+foo ()
+{
+  U a[] = { 1, 2, 3, 4 };
+  U b[] = { 5, 6, 7, 8 };
+  T *c = (T *) a + 1;
+  S *d = (S *) c + 2;
+  S *e = (S *) b + 1;
+
+  if (a + 0 == b + 0)
+    return false;
+
+  if (d == e)
+    return false;
+
+  return true;
+}
+
+static_assert (foo (), "");