diff mbox

Simplify X / X, 0 / X and X % X

Message ID 8721f2da-686b-e942-eda1-3bf56a9b743a@redhat.com
State New
Headers show

Commit Message

Jeff Law Nov. 11, 2016, 4:01 p.m. UTC
On 11/07/2016 08:19 AM, Jeff Law wrote:
> On 11/07/2016 03:02 AM, Richard Biener wrote:
>> On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law <law@redhat.com> wrote:
>>> On 11/04/2016 02:07 PM, Marc Glisse wrote:
>>>>
>>>> Hello,
>>>>
>>>> since we were discussing this recently...
>>>>
>>>> The condition is copied from the existing 0 % X case, visible in the
>>>> context of the diff.
>>>>
>>>> As far as I understand, the main case where we do not want to optimize
>>>> is during constexpr evaluation in the C++ front-end (it wants to detect
>>>> the undefined behavior), and with late folding I think this means we
>>>> only need to care about an explicit 0/0, not about X/X where X would
>>>> become 0 after the simplification.
>>>>
>>>> And later, if we do have something like X/0, we could handle it the
>>>> same
>>>> way as we currently handle *(char*)0, insert a trap after that
>>>> instruction and clear the following code, which likely gives better
>>>> code
>>>> than replacing 0/0 with 1.
>>>
>>> Yup.  I'd prefer to insert a trap if we ultimately expose a division
>>> by zero
>>> -- including cases where that division occurs as a result of a PHI
>>> arg being
>>> zero and the PHI result being used as a denominator in a division
>>> expression.
>>>
>>> It ought to be extremely easy to detect & transform that case (and
>>> probably
>>> warn for it too).
>>
>> We have gimple-ssa-isolate-paths.c for that, right?
> Right.   I was thinking about instrumenting for it today to see if it's
> worth any effort.  It shouldn't take more than a few minutes once I
> refamiliarize myself with isolate-paths.
So we get a few hits in GCC itself due to the structure of our vec 
implementation.  Essentially the length() can return 0 explicitly:

  unsigned length (void) const
   { return m_vec ? m_vec->length () : 0; }

So if we feed the result of a length() call into a div/mod operation we 
can get a path were an explicit zero reaches the div/mod.  It'll 
ultimately look something like this in the IL:


   # iftmp.0_7 = PHI <0(2), iftmp.0_6(3)>
   _5 = j_4(D) % iftmp.0_7;


If this feels familiar, it should.  The same general style of coding is 
what led me down this path in 2011/2013 with other parts of our vec 
implementation (VEC_BASE which was then used in a dereference).

There's another case in the GCD code in tree-data-ref.c that looks like 
it was trying to avoid a false positive uninit warning, but in the 
process introduced a path where we could divide by zero.


Anyway, with this patch we isolate the path where the 0 feeds a div/mod 
insn.  On the isolated path we replace the div/mod with an explicit trap 
in a manner similar to how we handled conditional NULL pointer dereferences.

This in and of itself is only mildly interesting -- it'll enable 
generation of conditional traps on targets that have that capability 
(verified a trivial case on ppc).  But the real point of the path 
isolation code is pick up secondary optimization/analysis opportunities.

For example, if we isolate a path, the existing PHI sometimes turns into 
a degenerate and can be propagated away completely.  With the erroneous 
path isolated and not merging back into the main path, the CFG for the 
main path often simplifies as well.  These kinds of things happened 
regularly with the NULL pointer dereference isolation code.

I don't see any of those happening in the division or modulo by zero 
cases I've looked at from within GCC itself.  There's nothing inherently 
different IMHO, it's just a case that there's fewer opportunities and 
thus fewer chances to actually improve things.

There's also a series of explicit division-by-zero opportunities that 
show up during a build.  They're all a result of an explicit division by 
zero from inside libgcc2.c.   It's almost certain that code is supposed 
to generate a fault (and continues to do so, just via a different 
mechanism).  There may be some chance for additional optimization in 
this case.  Essentially we'd replace:


          if (d0 == 0)
             d0 = 1 / d0;        /* Divide intentionally by zero.  */
          [ more code ]

With

	if (d0 == 0)
	  __builtin_trap ();
	[ more code ]

And we'd then know that d0 would have a ~[0,0] range the the remainder 
of the code.

Anyway, the patch looks a lot bigger than it really is due to a bit of 
refactoring.  I just hated how we ended up duplicating the actual 
mechanics of the transformation.

I also reviewed the situation with LLVM.  When they see a division by 
zero, it gets replaced with a constant.  I can certainly see the 
pros/cons of that vs trapping.  I think explicit trapping more closely 
matches what we'd likely get from the hardware or a library routine 
implementing div/mod.  So I'm sticking with trapping.



Bootstrapped, regression tested and installed on the trunk.

Jeff
* gimple-ssa-isolate-paths.c (is_divmod_with_given_divisor): New
	function.
	(stmt_uses_name_in_undefined_way): New function, extracted from
	find_implicit_erroneous_behavior and extended for div/mod case.
	(stmt_uses_0_or_null_in_undefined_way): New function, extracted from
	find_explicit_erroneous_behavior and extended for div/mod case.
	(find_implicit_erroneous_behavior): Use new helper function.
	(find_explicit_erroneous_behavior): Use new helper function.


	* gcc.dg/tree-ssa/isolate-6.c: New test.
	* gcc.dg/tree-ssa/isolate-7.c: New test.
diff mbox

Patch

diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index 9d2fc8a..84048d3 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -206,6 +206,124 @@  isolate_path (basic_block bb, basic_block duplicate,
   return duplicate;
 }
 
+/* Return TRUE if STMT is a div/mod operation using DIVISOR as the divisor.
+   FALSE otherwise.  */
+
+static bool
+is_divmod_with_given_divisor (gimple *stmt, tree divisor)
+{
+  /* Only assignments matter.  */
+  if (!is_gimple_assign (stmt))
+    return false;
+
+  /* Check for every DIV/MOD expression.  */
+  enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
+  if (rhs_code == TRUNC_DIV_EXPR
+      || rhs_code == FLOOR_DIV_EXPR
+      || rhs_code == CEIL_DIV_EXPR
+      || rhs_code == EXACT_DIV_EXPR
+      || rhs_code == ROUND_DIV_EXPR
+      || rhs_code == TRUNC_MOD_EXPR
+      || rhs_code == FLOOR_MOD_EXPR
+      || rhs_code == CEIL_MOD_EXPR
+      || rhs_code == ROUND_MOD_EXPR)
+    {
+      /* Pointer equality is fine when DIVISOR is an SSA_NAME, but
+	 not sufficient for constants which may have different types.  */
+      if (operand_equal_p (gimple_assign_rhs2 (stmt), divisor, 0))
+	return true;
+    }
+  return false;
+}
+
+/* NAME is an SSA_NAME that we have already determined has the value 0 or NULL.
+
+   Return TRUE if USE_STMT uses NAME in a way where a 0 or NULL value results
+   in undefined behavior, FALSE otherwise
+
+   LOC is used for issuing diagnostics.  This case represents potential
+   undefined behavior exposed by path splitting and that's reflected in
+   the diagnostic.  */
+
+bool
+stmt_uses_name_in_undefined_way (gimple *use_stmt, tree name, location_t loc)
+{
+  /* If we are working with a non pointer type, then see
+     if this use is a DIV/MOD operation using NAME as the
+     divisor.  */
+  if (!POINTER_TYPE_P (TREE_TYPE (name)))
+    {
+      if (!flag_non_call_exceptions)
+	return is_divmod_with_given_divisor (use_stmt, name);
+      return false;
+    }
+
+  /* NAME is a pointer, so see if it's used in a context where it must
+     be non-NULL.  */
+  bool by_dereference
+    = infer_nonnull_range_by_dereference (use_stmt, name);
+
+  if (by_dereference
+      || infer_nonnull_range_by_attribute (use_stmt, name))
+    {
+
+      if (by_dereference)
+	{
+	  warning_at (loc, OPT_Wnull_dereference,
+		      "potential null pointer dereference");
+	  if (!flag_isolate_erroneous_paths_dereference)
+	    return false;
+	}
+      else
+	{
+	  if (!flag_isolate_erroneous_paths_attribute)
+	    return false;
+	}
+      return true;
+    }
+  return false;
+}
+
+/* Return TRUE if USE_STMT uses 0 or NULL in a context which results in
+   undefined behavior, FALSE otherwise.
+
+   These cases are explicit in the IL.  */
+
+bool
+stmt_uses_0_or_null_in_undefined_way (gimple *stmt)
+{
+  if (!flag_non_call_exceptions
+      && is_divmod_with_given_divisor (stmt, integer_zero_node))
+    return true;
+
+  /* By passing null_pointer_node, we can use the
+     infer_nonnull_range functions to detect explicit NULL
+     pointer dereferences and other uses where a non-NULL
+     value is required.  */
+
+  bool by_dereference
+    = infer_nonnull_range_by_dereference (stmt, null_pointer_node);
+  if (by_dereference
+      || infer_nonnull_range_by_attribute (stmt, null_pointer_node))
+    {
+      if (by_dereference)
+	{
+	  location_t loc = gimple_location (stmt);
+	  warning_at (loc, OPT_Wnull_dereference,
+		      "null pointer dereference");
+	  if (!flag_isolate_erroneous_paths_dereference)
+	    return false;
+	}
+      else
+	{
+	  if (!flag_isolate_erroneous_paths_attribute)
+	    return false;
+	}
+      return true;
+    }
+  return false;
+}
+
 /* Look for PHI nodes which feed statements in the same block where
    the value of the PHI node implies the statement is erroneous.
 
@@ -243,11 +361,6 @@  find_implicit_erroneous_behavior (void)
 	  gphi *phi = si.phi ();
 	  tree lhs = gimple_phi_result (phi);
 
-	  /* If the result is not a pointer, then there is no need to
- 	     examine the arguments.  */
-	  if (!POINTER_TYPE_P (TREE_TYPE (lhs)))
-	    continue;
-
 	  /* PHI produces a pointer result.  See if any of the PHI's
 	     arguments are NULL.
 
@@ -315,29 +428,12 @@  find_implicit_erroneous_behavior (void)
 		  if (gimple_bb (use_stmt) != bb)
 		    continue;
 
-		  bool by_dereference 
-		    = infer_nonnull_range_by_dereference (use_stmt, lhs);
+		  location_t loc = gimple_location (use_stmt)
+		    ? gimple_location (use_stmt)
+		    : gimple_phi_arg_location (phi, i);
 
-		  if (by_dereference 
-		      || infer_nonnull_range_by_attribute (use_stmt, lhs))
+		  if (stmt_uses_name_in_undefined_way (use_stmt, lhs, loc))
 		    {
-		      location_t loc = gimple_location (use_stmt)
-			? gimple_location (use_stmt)
-			: gimple_phi_arg_location (phi, i);
-
-		      if (by_dereference)
-			{
-			  warning_at (loc, OPT_Wnull_dereference,
-				      "potential null pointer dereference");
-			  if (!flag_isolate_erroneous_paths_dereference)
-			    continue;
-			}
-		      else 
-			{
-			  if (!flag_isolate_erroneous_paths_attribute)
-			    continue;
-			}
-
 		      duplicate = isolate_path (bb, duplicate, e,
 						use_stmt, lhs, false);
 
@@ -383,29 +479,8 @@  find_explicit_erroneous_behavior (void)
 	{
 	  gimple *stmt = gsi_stmt (si);
 
-	  /* By passing null_pointer_node, we can use the
-	     infer_nonnull_range functions to detect explicit NULL
-	     pointer dereferences and other uses where a non-NULL
-	     value is required.  */
-	  
-	  bool by_dereference
-	    = infer_nonnull_range_by_dereference (stmt, null_pointer_node);
-	  if (by_dereference
-	      || infer_nonnull_range_by_attribute (stmt, null_pointer_node))
+	  if (stmt_uses_0_or_null_in_undefined_way (stmt))
 	    {
-	      if (by_dereference)
-		{
-		  warning_at (gimple_location (stmt), OPT_Wnull_dereference,
-			      "null pointer dereference");
-		  if (!flag_isolate_erroneous_paths_dereference)
-		    continue;
-		}
-	      else
-		{
-		  if (!flag_isolate_erroneous_paths_attribute)
-		    continue;
-		}
-
 	      insert_trap (&si, null_pointer_node);
 	      bb = gimple_bb (gsi_stmt (si));
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/isolate-6.c b/gcc/testsuite/gcc.dg/tree-ssa/isolate-6.c
new file mode 100644
index 0000000..ec7c57a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/isolate-6.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-isolate-paths" } */
+
+int x, y;
+
+static inline int
+z ()
+{
+  return x ? y : 0;
+}
+
+int
+lower_for (int j)
+{
+  return j % z ();
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_trap" 1 "isolate-paths"} } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/isolate-7.c b/gcc/testsuite/gcc.dg/tree-ssa/isolate-7.c
new file mode 100644
index 0000000..e63d5a0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/isolate-7.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-isolate-paths" } */
+
+extern int oof ();
+extern int x;
+_Bool
+gcd_of_steps_may_divide_p ()
+{
+  long cd = 0, val;
+  if (x)
+    cd = oof ();
+  return val % cd == 0;
+}
+/* { dg-final { scan-tree-dump-times "__builtin_trap" 1 "isolate-paths"} } */
+