
[12/22] Add source-ranges for trees

Message ID 1441916913-11547-13-git-send-email-dmalcolm@redhat.com
State New

Commit Message

David Malcolm Sept. 10, 2015, 8:28 p.m. UTC
This patch adds a way to associate source range information with tree
expressions and decls, for later use by diagnostics.

It's a poor implementation which is unacceptable on multiple grounds:
for starters, it adds a source_range (8 bytes) to struct tree_exp and
to struct tree_decl_minimal.  (It also doesn't bootstrap.)

Adding the source_range fields above covers most expressions, but
doesn't help with references to decls and to constants.  Consider:

  int test (int foo)
  {
    return foo * 100;
           ^^^   ^^^
  }

I want gcc to retain information so that diagnostics can underline
the "foo" and "100" above, but all we have is a PARM_DECL and an
INTEGER_CST.  The former's location is at the top of the
function, and the latter has no location.

Hence the patch adds a new SOURCE_RANGE tree code.  This
is a kind of unary operator or wrapper, which wraps things that don't
have a range field themselves (further bloating the IR).

SOURCE_RANGE nodes get thrown away during gimplification.
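
To illustrate the wrapper approach, here is a minimal standalone sketch.
It does not use GCC's real tree types: toy_node, toy_pool and the toy_*
functions are hypothetical stand-ins, and only the control flow is meant
to mirror set_source_range and the stripping loop in gimplify_expr from
the patch below.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins for GCC's types; none of these are the real definitions.
typedef unsigned int location_t;
const location_t UNKNOWN_LOCATION = 0;

struct source_range { location_t m_start; location_t m_finish; };

enum toy_code { TOY_PARM_DECL, TOY_INTEGER_CST, TOY_MULT_EXPR, TOY_SOURCE_RANGE };

struct toy_node {
  toy_code code;
  source_range range;   // only meaningful for expression-like codes
  toy_node *op0;
  toy_node *op1;
};

// Owns every node so the sketch needs no manual cleanup.
struct toy_pool {
  std::vector<std::unique_ptr<toy_node>> nodes;

  toy_node *make (toy_code code, toy_node *op0 = nullptr, toy_node *op1 = nullptr)
  {
    nodes.push_back (std::unique_ptr<toy_node> (
      new toy_node {code, {UNKNOWN_LOCATION, UNKNOWN_LOCATION}, op0, op1}));
    return nodes.back ().get ();
  }
};

// Expressions carry their own range; decls and constants do not.
bool toy_expr_p (const toy_node *n)
{
  return n->code == TOY_MULT_EXPR || n->code == TOY_SOURCE_RANGE;
}

// Same shape as set_source_range in the patch: wrap anything that has no
// range field of its own, then record the range on the (possibly new) node.
void toy_set_source_range (toy_pool &pool, toy_node **expr,
                           location_t start, location_t finish)
{
  assert (expr && *expr);
  if (!toy_expr_p (*expr))
    *expr = pool.make (TOY_SOURCE_RANGE, *expr);
  (*expr)->range.m_start = start;
  (*expr)->range.m_finish = finish;
}

// Same shape as the loop added to gimplify_expr: peel off all wrappers.
toy_node *toy_strip_source_ranges (toy_node *expr)
{
  while (expr->code == TOY_SOURCE_RANGE)
    expr = expr->op0;
  return expr;
}
```

In this sketch a use of a decl gains a wrapper node, whereas an
expression node is updated in place; stripping recovers the original
decl.  Any code that inspects an operand's code directly has to know
about the wrapper, which is why the frontends need the extra
SOURCE_RANGE cases.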

This works for simple cases, but isn't yet complete: there are plenty
of places where the frontends will fail if they see a SOURCE_RANGE.

So, as I said, it's a poor implementation, but the followup patches
needed some way to record source range information for trees.

Some alternate ideas for how this could be implemented:

(a) somehow compress the location and range into the 4 bytes taken up
    by a source_location: are we really using all 32 bits?  I suspect
    that in real-world code, there's enough closeness between the
    location, range start and range finish that we can pack them into
    32 bits for most cases, with some kind of lookaside for those that
    don't fit.

(b) introduce some kind of "DECL_USAGE" or "DECL_REF" tree node, an
    expression that references a decl, segregating decls from expression
    trees, putting the location information into the DECL_USAGE node.

(c) only bother tracking the information if -fdiagnostics-show-caret
    is enabled (note, though, that it's on by default; if we go down
    this path, maybe it's another thing for torture testing?).

(d) only track information temporarily (e.g. in c_expr, rather than in
    tree), discarding it as the tree is built, or perhaps special-casing
    some places where it's particularly worth preserving e.g. the ranges
    of the arguments at a callsite, so the user can easily identify
    whatever "argument 3" means.

etc.  Ideas?
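
As a rough illustration of idea (a), the following standalone sketch
packs a caret position plus its distance to the range start and finish
into one 32-bit value, and falls back to a lookaside table when the
fields don't fit.  Everything here is an assumption made for the
example: toy_range, toy_range_table and the 20/5/6 bit split are
hypothetical, and real source_location values are line-map ordinals
rather than the plain integers used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit budget: 20 bits of caret, 5 bits of distance back to
// the start, 6 bits of distance forward to the finish, 1 flag bit.
constexpr unsigned CARET_BITS = 20;
constexpr unsigned BEFORE_BITS = 5;
constexpr unsigned AFTER_BITS = 6;
constexpr uint32_t LOOKASIDE_FLAG = 1u << 31;

struct toy_range { uint32_t caret; uint32_t start; uint32_t finish; };

class toy_range_table
{
public:
  // Return a 32-bit handle for R, packing it inline when it fits.
  uint32_t encode (const toy_range &r)
  {
    assert (r.start <= r.caret && r.caret <= r.finish);
    uint32_t before = r.caret - r.start;
    uint32_t after = r.finish - r.caret;
    if (r.caret < (1u << CARET_BITS)
        && before < (1u << BEFORE_BITS)
        && after < (1u << AFTER_BITS))
      return r.caret
             | (before << CARET_BITS)
             | (after << (CARET_BITS + BEFORE_BITS));

    // Doesn't fit: store the full range and return its index, flagged.
    // (The sketch ignores overflow of the 31-bit index.)
    m_lookaside.push_back (r);
    return LOOKASIDE_FLAG | static_cast<uint32_t> (m_lookaside.size () - 1);
  }

  // Recover the full range from a handle produced by encode.
  toy_range decode (uint32_t packed) const
  {
    if (packed & LOOKASIDE_FLAG)
      return m_lookaside.at (packed & ~LOOKASIDE_FLAG);
    uint32_t caret = packed & ((1u << CARET_BITS) - 1);
    uint32_t before = (packed >> CARET_BITS) & ((1u << BEFORE_BITS) - 1);
    uint32_t after
      = (packed >> (CARET_BITS + BEFORE_BITS)) & ((1u << AFTER_BITS) - 1);
    return toy_range {caret, caret - before, caret + after};
  }

  std::size_t lookaside_size () const { return m_lookaside.size (); }

private:
  std::vector<toy_range> m_lookaside;
};
```

Whether most ranges would take the inline path depends entirely on how
many bits the location itself needs, which is the open question in (a).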

gcc/c-family/ChangeLog:
	* c-common.c (c_fully_fold_internal): Capture existing source_range,
	and store it on the result.
	* c-pretty-print.c (c_pretty_printer::expression): Handle
	SOURCE_RANGE.

gcc/c/ChangeLog:
	* c-typeck.c (array_to_pointer_conversion): Handle SOURCE_RANGE,
	and preserve any source range information.
	(build_function_call_vec): Handle SOURCE_RANGE.
	(lvalue_p): Likewise.
	(c_finish_return): Likewise.

gcc/ChangeLog:
	* gimplify.c (gimplify_expr): Throw away SOURCE_RANGEs.
	* print-tree.c (print_node): Print any source range information.
	* tree-core.h (struct tree_exp): Add a "range" field.
	(struct tree_decl_minimal): Likewise.
	* tree.c (build1_stat): Initialize EXPR_LOCATION_RANGE (t).
	(build_decl_stat): Add overload taking a source_range.
	(set_source_range): New functions.
	* tree.def (SOURCE_RANGE): New tree code.
	* tree.h (CAN_HAVE_RANGE_P): New.
	(EXPR_LOCATION_RANGE): New.
	(EXPR_RANGE_OR_LOC): New.
	(EXPR_HAS_RANGE): New.
	(DECL_LOCATION_RANGE): New.
	(build_decl_stat): New overload.
	(set_source_range): New decls.
---
 gcc/c-family/c-common.c       | 10 +++++++++-
 gcc/c-family/c-pretty-print.c |  4 ++++
 gcc/c/c-typeck.c              | 23 ++++++++++++++++++++++-
 gcc/gimplify.c                |  4 ++++
 gcc/print-tree.c              | 21 +++++++++++++++++++++
 gcc/tree-core.h               |  2 ++
 gcc/tree.c                    | 42 ++++++++++++++++++++++++++++++++++++++----
 gcc/tree.def                  |  2 ++
 gcc/tree.h                    | 21 +++++++++++++++++++++
 9 files changed, 123 insertions(+), 6 deletions(-)

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index c02ea39..ff6f90f 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -1178,6 +1178,7 @@  c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
   bool nowarning = TREE_NO_WARNING (expr);
   bool unused_p;
+  source_range old_range;
 
   /* This function is not relevant to C++ because C++ folds while
      parsing, and may need changes to be correct for C++ when C++
@@ -1193,6 +1194,9 @@  c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       || code == SAVE_EXPR)
     return expr;
 
+  if (IS_EXPR_CODE_CLASS (kind))
+    old_range = EXPR_LOCATION_RANGE (expr);
+
   /* Operands of variable-length expressions (function calls) have
      already been folded, as have __builtin_* function calls, and such
      expressions cannot occur in constant expressions.  */
@@ -1617,7 +1621,11 @@  c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       TREE_NO_WARNING (ret) = 1;
     }
   if (ret != expr)
-    protected_set_expr_location (ret, loc);
+    {
+      protected_set_expr_location (ret, loc);
+      if (IS_EXPR_CODE_CLASS (kind))
+	set_source_range (&ret, old_range.m_start, old_range.m_finish);
+    }
   return ret;
 }
 
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index e2809cf..c70cfe0 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -2319,6 +2319,10 @@  c_pretty_printer::expression (tree e)
       expression (C_MAYBE_CONST_EXPR_EXPR (e));
       break;
 
+    case SOURCE_RANGE:
+      expression (TREE_OPERAND (e, 0));
+      break;
+
     default:
       pp_unsupported_tree (this, e);
       break;
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index a755a7e..6c60dc8 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -1815,6 +1815,7 @@  array_to_pointer_conversion (location_t loc, tree exp)
   tree adr;
   tree restype = TREE_TYPE (type);
   tree ptrtype;
+  tree source_range = NULL;
 
   gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
 
@@ -1828,6 +1829,12 @@  array_to_pointer_conversion (location_t loc, tree exp)
   if (INDIRECT_REF_P (exp))
     return convert (ptrtype, TREE_OPERAND (exp, 0));
 
+  if (TREE_CODE (exp) == SOURCE_RANGE)
+    {
+      source_range = exp;
+      exp = TREE_OPERAND (exp, 0);
+    }
+
   /* In C++ array compound literals are temporary objects unless they are
      const or appear in namespace scope, so they are destroyed too soon
      to use them for much of anything  (c++/53220).  */
@@ -1841,7 +1848,10 @@  array_to_pointer_conversion (location_t loc, tree exp)
     }
 
   adr = build_unary_op (loc, ADDR_EXPR, exp, 1);
-  return convert (ptrtype, adr);
+  tree result = convert (ptrtype, adr);
+  if (source_range)
+    set_source_range (&result, EXPR_LOCATION_RANGE (source_range));
+  return result;
 }
 
 /* Convert the function expression EXP to a pointer.  */
@@ -2867,6 +2877,9 @@  build_function_call_vec (location_t loc, vec<location_t> arg_loc,
   /* Strip NON_LVALUE_EXPRs, etc., since we aren't using as an lvalue.  */
   STRIP_TYPE_NOPS (function);
 
+  if (TREE_CODE (function) == SOURCE_RANGE)
+    function = TREE_OPERAND (function, 0);
+
   /* Convert anything with function type to a pointer-to-function.  */
   if (TREE_CODE (function) == FUNCTION_DECL)
     {
@@ -4306,6 +4319,9 @@  lvalue_p (const_tree ref)
     case BIND_EXPR:
       return TREE_CODE (TREE_TYPE (ref)) == ARRAY_TYPE;
 
+    case SOURCE_RANGE:
+      return lvalue_p (TREE_OPERAND (ref, 0));
+
     default:
       return 0;
     }
@@ -9466,6 +9482,7 @@  c_finish_return (location_t loc, tree retval, tree origtype)
 	    case NON_LVALUE_EXPR:
 	    case PLUS_EXPR:
 	    case POINTER_PLUS_EXPR:
+	    case SOURCE_RANGE:
 	      inner = TREE_OPERAND (inner, 0);
 	      continue;
 
@@ -9475,6 +9492,8 @@  c_finish_return (location_t loc, tree retval, tree origtype)
 		 don't give a warning.  */
 	      {
 		tree op1 = TREE_OPERAND (inner, 1);
+		if (TREE_CODE (op1) == SOURCE_RANGE)
+		  op1 = TREE_OPERAND (op1, 0);
 
 		while (!POINTER_TYPE_P (TREE_TYPE (op1))
 		       && (CONVERT_EXPR_P (op1)
@@ -9490,6 +9509,8 @@  c_finish_return (location_t loc, tree retval, tree origtype)
 
 	    case ADDR_EXPR:
 	      inner = TREE_OPERAND (inner, 0);
+	      if (TREE_CODE (inner) == SOURCE_RANGE)
+		inner = TREE_OPERAND (inner, 0);
 
 	      while (REFERENCE_CLASS_P (inner)
 		     && !INDIRECT_REF_P (inner))
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index b7a918b..47508b3 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7962,6 +7962,10 @@  gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	 at the toplevel.  */
       STRIP_USELESS_TYPE_CONVERSION (*expr_p);
 
+      /* For now, strip away source ranges here.  */
+      while (TREE_CODE (*expr_p) == SOURCE_RANGE)
+	*expr_p = TREE_OPERAND (*expr_p, 0);
+
       /* Remember the expr.  */
       save_expr = *expr_p;
 
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index ea50056..8b3794a 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -936,6 +936,27 @@  print_node (FILE *file, const char *prefix, tree node, int indent)
       expanded_location xloc = expand_location (EXPR_LOCATION (node));
       indent_to (file, indent+4);
       fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column);
+
+      /* Print the range, if any.  */
+      source_range r = EXPR_LOCATION_RANGE (node);
+      if (r.m_start)
+	{
+	  xloc = expand_location (r.m_start);
+	  fprintf (file, " start: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " start: unknown");
+	}
+      if (r.m_finish)
+	{
+	  xloc = expand_location (r.m_finish);
+	  fprintf (file, " finish: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " finish: unknown");
+	}
     }
 
   fprintf (file, ">");
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 64d1fe4..6931ad9 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1235,6 +1235,7 @@  enum omp_clause_proc_bind_kind
 struct GTY(()) tree_exp {
   struct tree_typed typed;
   location_t locus;
+  source_range range;
   tree GTY ((special ("tree_exp"),
 	     desc ("TREE_CODE ((tree) &%0)")))
     operands[1];
@@ -1404,6 +1405,7 @@  struct GTY (()) tree_binfo {
 struct GTY(()) tree_decl_minimal {
   struct tree_common common;
   location_t locus;
+  source_range range;
   unsigned int uid;
   tree name;
   tree context;
diff --git a/gcc/tree.c b/gcc/tree.c
index ed64fe7..d1595c2 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -4320,6 +4320,8 @@  build1_stat (enum tree_code code, tree type, tree node MEM_STAT_DECL)
 
   TREE_TYPE (t) = type;
   SET_EXPR_LOCATION (t, UNKNOWN_LOCATION);
+  EXPR_LOCATION_RANGE (t).m_start = UNKNOWN_LOCATION;
+  EXPR_LOCATION_RANGE (t).m_finish = UNKNOWN_LOCATION;
   TREE_OPERAND (t, 0) = node;
   if (node && !TYPE_P (node))
     {
@@ -4641,19 +4643,20 @@  build_nt_call_vec (tree fn, vec<tree, va_gc> *args)
 /* Create a DECL_... node of code CODE, name NAME and data type TYPE.
    We do NOT enter this node in any sort of symbol table.
 
-   LOC is the location of the decl.
+   RANGE is the source location of the decl.
 
    layout_decl is used to set up the decl's storage layout.
    Other slots are initialized to 0 or null pointers.  */
 
 tree
-build_decl_stat (location_t loc, enum tree_code code, tree name,
-    		 tree type MEM_STAT_DECL)
+build_decl_stat (source_range range, enum tree_code code, tree name,
+		 tree type MEM_STAT_DECL)
 {
   tree t;
 
   t = make_node_stat (code PASS_MEM_STAT);
-  DECL_SOURCE_LOCATION (t) = loc;
+  DECL_SOURCE_LOCATION (t) = range.m_start;
+  DECL_LOCATION_RANGE (t) = range;
 
 /*  if (type == error_mark_node)
     type = integer_type_node; */
@@ -4669,6 +4672,16 @@  build_decl_stat (location_t loc, enum tree_code code, tree name,
   return t;
 }
 
+/* As "build_decl_stat" above, but for location LOC.  */
+
+tree
+build_decl_stat (location_t loc, enum tree_code code, tree name,
+		 tree type MEM_STAT_DECL)
+{
+  return build_decl_stat (source_range::from_location (loc),
+			  code, name, type PASS_MEM_STAT);
+}
+
 /* Builds and returns function declaration with NAME and TYPE.  */
 
 tree
@@ -13646,5 +13659,26 @@  nonnull_arg_p (const_tree arg)
   return false;
 }
 
+void
+set_source_range (tree *expr, location_t start, location_t finish)
+{
+  /* Add wrapper nodes for e.g. mentions of a parm_decl
+     in an expression, constants, etc.  */
+  if (!EXPR_P (*expr))
+    {
+      tree wrapper = build1 (SOURCE_RANGE, TREE_TYPE (*expr), *expr);
+      SET_EXPR_LOCATION (wrapper, start);
+      *expr = wrapper;
+    }
+
+  EXPR_LOCATION_RANGE (*expr).m_start = start;
+  EXPR_LOCATION_RANGE (*expr).m_finish = finish;
+}
+
+void
+set_source_range (tree *expr, source_range src_range)
+{
+  set_source_range (expr, src_range.m_start, src_range.m_finish);
+}
 
 #include "gt-tree.h"
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..6ad84d7 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1380,6 +1380,8 @@  DEFTREECODE (CILK_SPAWN_STMT, "cilk_spawn_stmt", tcc_statement, 1)
 /* Cilk Sync statement: Does not have any operands.  */
 DEFTREECODE (CILK_SYNC_STMT, "cilk_sync_stmt", tcc_statement, 0)
 
+DEFTREECODE (SOURCE_RANGE, "source_range", tcc_expression, 1)
+
 /*
 Local variables:
 mode:c
diff --git a/gcc/tree.h b/gcc/tree.h
index e500151..66419d4 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1066,6 +1066,16 @@  extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define EXPR_FILENAME(NODE) LOCATION_FILE (EXPR_CHECK ((NODE))->exp.locus)
 #define EXPR_LINENO(NODE) LOCATION_LINE (EXPR_CHECK (NODE)->exp.locus)
 
+#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))
+#define EXPR_LOCATION_RANGE(NODE) (EXPR_CHECK ((NODE))->exp.range)
+#define EXPR_RANGE_OR_LOC(NODE, LOCUS) (CAN_HAVE_RANGE_P (NODE) \
+					? (NODE)->exp.range \
+					: source_range::from_location (LOCUS))
+#define EXPR_HAS_RANGE(NODE) \
+    (CAN_HAVE_RANGE_P (NODE) \
+     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
+     : false)
+
 /* True if a tree is an expression or statement that can have a
    location.  */
 #define CAN_HAVE_LOCATION_P(NODE) ((NODE) && EXPR_P (NODE))
@@ -2092,6 +2102,9 @@  extern machine_mode element_mode (const_tree t);
 #define DECL_IS_BUILTIN(DECL) \
   (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)
 
+#define DECL_LOCATION_RANGE(NODE) \
+  (DECL_MINIMAL_CHECK (NODE)->decl_minimal.range)
+
 /*  For FIELD_DECLs, this is the RECORD_TYPE, UNION_TYPE, or
     QUAL_UNION_TYPE node that the field is a member of.  For VAR_DECL,
     PARM_DECL, FUNCTION_DECL, LABEL_DECL, RESULT_DECL, and CONST_DECL
@@ -3784,6 +3797,8 @@  extern tree build_tree_list_vec_stat (const vec<tree, va_gc> *MEM_STAT_DECL);
 #define build_tree_list_vec(v) build_tree_list_vec_stat (v MEM_STAT_INFO)
 extern tree build_decl_stat (location_t, enum tree_code,
 			     tree, tree MEM_STAT_DECL);
+extern tree build_decl_stat (source_range, enum tree_code,
+			     tree, tree MEM_STAT_DECL);
 extern tree build_fn_decl (const char *, tree);
 #define build_decl(l,c,t,q) build_decl_stat (l, c, t, q MEM_STAT_INFO)
 extern tree build_translation_unit_decl (tree);
@@ -5133,6 +5148,12 @@  type_with_alias_set_p (const_tree t)
   return false;
 }
 
+extern void
+set_source_range (tree *expr, location_t start, location_t finish);
+
+extern void
+set_source_range (tree *expr, source_range src_range);
+
 extern void gt_ggc_mx (tree &);
 extern void gt_pch_nx (tree &);
 extern void gt_pch_nx (tree &, gt_pointer_operator, void *);