sccvn: Punt on overflows during vn_reference_lookup_3

Message ID 20200227085817.GB2155@tucnak
State New

Commit Message

Jakub Jelinek Feb. 27, 2020, 8:58 a.m. UTC
Hi!

I admit I don't have testcases, but I'm afraid very bad things will happen
if either the offset2i - offseti or pd.offset + pd.size computations
overflow.  All are computed in (signed) HOST_WIDE_INT, and at least for the
memset or CONSTRUCTOR cases I'd fear the stores could be extremely large.
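
(The checks below are exact: offset2i - offseti can wrap below the
HOST_WIDE_INT minimum only when offseti > 0, namely iff
offset2i < INTTYPE_MINIMUM (HOST_WIDE_INT) + offseti, and can wrap
above the maximum only when offseti < 0, iff
offset2i > INTTYPE_MAXIMUM (HOST_WIDE_INT) + offseti; both right-hand
sides are themselves safe to compute.)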

Or shall we introduce some system.h macros for this, say
HWI_SADD_OVERFLOW_P and HWI_SSUB_OVERFLOW_P that could use
__builtin_{add,sub}_overflow_p when available or do those
comparison checks and use those here?
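
Something like the following, perhaps (just a sketch of what I mean,
not actual system.h contents; __builtin_{add,sub}_overflow_p is
available since GCC 7):

  /* Hypothetical: true iff A + B resp. A - B would overflow signed
     HOST_WIDE_INT.  */
  #if GCC_VERSION >= 7000
  # define HWI_SADD_OVERFLOW_P(A, B) \
      __builtin_add_overflow_p ((A), (B), (HOST_WIDE_INT) 0)
  # define HWI_SSUB_OVERFLOW_P(A, B) \
      __builtin_sub_overflow_p ((A), (B), (HOST_WIDE_INT) 0)
  #else
  # define HWI_SADD_OVERFLOW_P(A, B) \
      ((B) > 0 ? (A) > INTTYPE_MAXIMUM (HOST_WIDE_INT) - (B) \
               : (A) < INTTYPE_MINIMUM (HOST_WIDE_INT) - (B))
  # define HWI_SSUB_OVERFLOW_P(A, B) \
      ((B) < 0 ? (A) > INTTYPE_MAXIMUM (HOST_WIDE_INT) + (B) \
               : (A) < INTTYPE_MINIMUM (HOST_WIDE_INT) + (B))
  #endif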

Bootstrapped/regtested on x86_64-linux and i686-linux.

2020-02-27  Jakub Jelinek  <jakub@redhat.com>

	* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Punt if
	pd.offset + pd.size would overflow.
	(vn_reference_lookup_3): Punt if offset2i - offseti would overflow.


	Jakub

Comments

Richard Biener Feb. 27, 2020, 9:30 a.m. UTC | #1
On Thu, 27 Feb 2020, Jakub Jelinek wrote:

> Hi!
> 
> I admit I don't have testcases, but I'm afraid very bad things will happen
> if either the offset2i - offseti or pd.offset + pd.size computations
> overflow.  All are computed in (signed) HOST_WIDE_INT, and at least for the
> memset or CONSTRUCTOR cases I'd fear the stores could be extremely large.
> 
> Or shall we introduce some system.h macros for this, say
> HWI_SADD_OVERFLOW_P and HWI_SSUB_OVERFLOW_P that could use
> __builtin_{add,sub}_overflow_p when available or do those
> comparison checks and use those here?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux.

Obviously I don't like the repetitive boiler-plate after
the ranges_known_overlap_p checks so I wonder if we can at least
factor those into a VN-local ranges_known_overlap_for_pd_p predicate.
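
Something along these lines, maybe (an untested sketch only, reusing
the overflow checks from your patch):

  /* Like ranges_known_overlap_p, but additionally return false when
     POS2 - POS1 would overflow a signed HOST_WIDE_INT.  */
  static bool
  ranges_known_overlap_for_pd_p (HOST_WIDE_INT pos1, HOST_WIDE_INT size1,
                                 HOST_WIDE_INT pos2, HOST_WIDE_INT size2)
  {
    if (!ranges_known_overlap_p (pos1, size1, pos2, size2))
      return false;
    /* Punt on overflows in pos2 - pos1.  */
    if ((pos1 > 0 && pos2 < INTTYPE_MINIMUM (HOST_WIDE_INT) + pos1)
        || (pos1 < 0 && pos2 > INTTYPE_MAXIMUM (HOST_WIDE_INT) + pos1))
      return false;
    return true;
  }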

Wouldn't it be possible to simply constrain both sizes to half
of the address space?  Then the ranges_known_overlap_p should
guarantee that we can represent the difference between the
offsets in a signed HWI?
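
That is: if [o1, o1 + s1) and [o2, o2 + s2) are known to overlap, then
o2 < o1 + s1 and o1 < o2 + s2, i.e. -s2 < o2 - o1 < s1, so the offset
difference is bounded in magnitude by the larger of the two sizes and
cannot wrap once the sizes are constrained.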

Richard.

Jakub Jelinek Feb. 27, 2020, 9:44 a.m. UTC | #2
On Thu, Feb 27, 2020 at 10:30:21AM +0100, Richard Biener wrote:
> Obviously I don't like the repetitive boiler-plate after
> the ranges_known_overlap_p checks so I wonder if we can at least
> factor those into a VN-local ranges_known_overlap_for_pd_p predicate.

I can do that.

> Wouldn't it be possible to simply constrain both sizes to half
> of the address space?  Then the ranges_known_overlap_p should
> guarantee that we can represent the difference between the
> offsets in a signed HWI?

Well, it is already 1/8th of the address space for 64-bit VA, so that would
be 1/16th then.  Doing it in ranges_known_overlap_p might be too
restrictive, since other places might handle those fine.

Perhaps we should also rule out the case when pd.offset would be minimum,
because we e.g. use -pd.offset, or when it or pd.size would be maximum
(as we e.g. use r->size + 1).

I'm also a little bit worried about possible overflows in
      r->size = MAX (r->offset + r->size, newr.offset + newr.size) - r->offset;
or
      r->size = MAX (r->offset + r->size,
                     rafter->offset + rafter->size) - r->offset;
While in ranges_known_overlap_for_pd_p we'd ensure that *->offset + *->size
doesn't overflow, the subtraction still could.
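
E.g. (made-up numbers) r->offset == -0x3000000000000000 and
newr.offset + newr.size == 0x6000000000000000 are both fine
individually, but the subtraction yields 0x9000000000000000, which
doesn't fit in a signed 64-bit HWI anymore.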

	Jakub
Richard Biener Feb. 27, 2020, 10:06 a.m. UTC | #3
On Thu, 27 Feb 2020, Jakub Jelinek wrote:

> On Thu, Feb 27, 2020 at 10:30:21AM +0100, Richard Biener wrote:
> > Obviously I don't like the repetitive boiler-plate after
> > the ranges_known_overlap_p checks so I wonder if we can at least
> > factor those into a VN-local ranges_known_overlap_for_pd_p predicate.
> 
> I can do that.

OK.

> > Wouldn't it be possible to simply constrain both sizes to half
> > of the address space?  Then the ranges_known_overlap_p should
> > guarantee that we can represent the difference between the
> > offsets in a signed HWI?
> 
> Well, it is already 1/8th of the address space for 64-bit VA, so that would
> be 1/16th then.  Doing it in ranges_known_overlap_p might be too
> restrictive, since other places might handle those fine.
> 
> Perhaps we should also rule out the case when pd.offset would be minimum,
> because we e.g. use -pd.offset, or when it or pd.size would be maximum
> (as we e.g. use r->size + 1).
> 
> I'm also a little bit worried about possible overflows in
>       r->size = MAX (r->offset + r->size, newr.offset + newr.size) - r->offset;
> or
>       r->size = MAX (r->offset + r->size,
>                      rafter->offset + rafter->size) - r->offset;
> While in ranges_known_overlap_for_pd_p we'd ensure that *->offset + *->size
> doesn't overflow, the subtraction still could.

Hmm, maybe.  So basically

memset (p - large, 0, even-larger);  [p - large, p + 1]
memset (p, 0, large);                [p, p + large]

and a read of [p, p + 1], then we'll have a very small (negative) r->offset
while rafter->offset + rafter->size will be very large, so we'll get
up to SHWI_MAX - SHWI_MIN here.

Now the thing here would be to note that the [p, p + 1] lookup
constrains what we need to track, so we could prune partial-defs
easily (the uniform ones).  Of course that's extra code with possible
sources of bugs.  Or we'll bite the bullet and use widest_int
(or offset_int?) for the pd_range/pd_data offset/size pairs.
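
I.e. roughly (purely illustrative; the field layout mirrors the
existing structs):

  /* Hypothetical offset_int variants of pd_range/pd_data; offset_int
     should have enough precision that offset + size and the
     MAX (...) - offset merges above cannot wrap.  */
  struct pd_range
  {
    offset_int offset;
    offset_int size;
  };
  struct pd_data
  {
    tree rhs;
    offset_int offset;
    offset_int size;
  };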

Richard.

Patch

--- gcc/tree-ssa-sccvn.c.jj	2020-02-26 13:38:01.899937521 +0100
+++ gcc/tree-ssa-sccvn.c	2020-02-26 14:39:26.472695329 +0100
@@ -1778,7 +1778,9 @@  vn_walk_cb_data::push_partial_def (const
       || CHAR_BIT != 8
       || BITS_PER_UNIT != 8
       /* Not prepared to handle PDP endian.  */
-      || BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+      || BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
+      /* Punt on overflows during pd.offset + pd.size computation.  */
+      || pd.offset > INTTYPE_MAXIMUM (HOST_WIDE_INT) - pd.size)
     return (void *)-1;
 
   bool pd_constant_p = (TREE_CODE (pd.rhs) == CONSTRUCTOR
@@ -2680,7 +2682,13 @@  vn_reference_lookup_3 (ao_ref *ref, tree
 	       && offset2.is_constant (&offset2i)
 	       && maxsize.is_constant (&maxsizei)
 	       && ranges_known_overlap_p (offseti, maxsizei, offset2i,
-					  leni << LOG2_BITS_PER_UNIT))
+					  leni << LOG2_BITS_PER_UNIT)
+	       /* Punt on overflows.  */
+	       && !((offseti > 0
+		     && offset2i < INTTYPE_MINIMUM (HOST_WIDE_INT) + offseti)
+		    || (offseti < 0
+			&& offset2i > (INTTYPE_MAXIMUM (HOST_WIDE_INT)
+				       + offseti))))
 	{
 	  pd_data pd;
 	  pd.rhs = build_constructor (NULL_TREE, NULL);
@@ -2733,7 +2741,14 @@  vn_reference_lookup_3 (ao_ref *ref, tree
 		   && offset2.is_constant (&offset2i)
 		   && size2.is_constant (&size2i)
 		   && ranges_known_overlap_p (offseti, maxsizei,
-					      offset2i, size2i))
+					      offset2i, size2i)
+		   /* Punt on overflows.  */
+		   && !((offseti > 0
+			 && offset2i < (INTTYPE_MINIMUM (HOST_WIDE_INT)
+					+ offseti))
+			|| (offseti < 0
+			    && offset2i > (INTTYPE_MAXIMUM (HOST_WIDE_INT)
+					   + offseti))))
 	    {
 	      /* Let clobbers be consumed by the partial-def tracker
 	         which can choose to ignore them if they are shadowed
@@ -2886,7 +2901,14 @@  vn_reference_lookup_3 (ao_ref *ref, tree
 		}
 	    }
 	  else if (ranges_known_overlap_p (offseti, maxsizei, offset2i,
-					   size2i))
+					   size2i)
+		   /* Punt on overflows.  */
+		   && !((offseti > 0
+			 && offset2i < (INTTYPE_MINIMUM (HOST_WIDE_INT)
+					+ offseti))
+			|| (offseti < 0
+			    && offset2i > (INTTYPE_MAXIMUM (HOST_WIDE_INT)
+					   + offseti))))
 	    {
 	      pd_data pd;
 	      tree rhs = gimple_assign_rhs1 (def_stmt);
@@ -2969,7 +2991,14 @@  vn_reference_lookup_3 (ao_ref *ref, tree
 		   && offset.is_constant (&offseti)
 		   && offset2.is_constant (&offset2i)
 		   && size2.is_constant (&size2i)
-		   && ranges_known_overlap_p (offset, maxsize, offset2, size2))
+		   && ranges_known_overlap_p (offset, maxsize, offset2, size2)
+		   /* Punt on overflows.  */
+		   && !((offseti > 0
+			 && offset2i < (INTTYPE_MINIMUM (HOST_WIDE_INT)
+					+ offseti))
+			|| (offseti < 0
+			    && offset2i > (INTTYPE_MAXIMUM (HOST_WIDE_INT)
+					   + offseti))))
 	    {
 	      pd_data pd;
 	      pd.rhs = SSA_VAL (def_rhs);