diff mbox

make gengtype more robust against user error

Message ID 1382737048-28620-1-git-send-email-dmalcolm@redhat.com
State New
Headers show

Commit Message

David Malcolm Oct. 25, 2013, 9:37 p.m. UTC
This patch addresses various forms of failure described in
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html

It adds a "default: gcc_unreachable();" to the autogenerated switch()
statement in the routines for a base class, converting various kinds of
tag errors from leading to silent lack-of-field traversal into giving
run-time assertion failures (addressing (F) and (G))

It also issues an error within gengtype itself for a missing "desc"
(failure "B"), turning this into an error message from gengtype.

I found another potential failure mode:

(H) If you had:

class GTY((desc("%0.kind"), tag("0"))) foo
{
public:
  int kind;
  tree p;
};

class GTY((tag("1"))) bar : public foo
{
public:
  tree q;
};

static GTY(()) foo *dummy_foo;

and there are no explicit pointers to "bar" in the code, gengtype
treated it as unused, and silently omitted the case for it from
foo's marking routine.

I've updated set_gc_type_used so that it propagates usage down into
subclasses (which recurses), fixing this issue.

To do this efficiently we need to track subclasses in within gengtype,
so the patch also adds that, and uses it to eliminate an O(N^2).

Note that for error (G), if a class within the hierarchy omits a GTY
marker, gengtype doesn't parse it at all, and so doesn't parse the
inheritance information - so we can't issue a warning about this.
However, the lack of a tag will trigger a run-time assertion failure
due to hitting the "default:  gcc_unreachable();" in the switch.
The patch also adds a paragraph to the docs, spelling out the need
for evary class in such a hierarchy to have a GTY marker.

I believe this addresses all of the silent-lack-of-field-traversal
issues, converting them to gengtype errors or runtime assertions.
It also adds a handler for (E), turning this from a failure to
compile bogus C to a specific error in gengtype.

I'm bootstrapping/regtesting now.
OK for trunk if that passes?

gcc/
	* doc/gty.texi ("Inheritance and GTY"): Make it clear that
	to use autogenerated markers for a class-hierarchy, every class
	must have a GTY marker.
	* gengtype.h (struct type): Add linked list of subclasses to
	the "s" member of the union.
	(add_subclass): New decl.
	* gengtype-state.c (read_state_struct_type): Set up subclass
	linked list.
	* gengtype.c (get_ultimate_base_class): New.
	(add_subclass): New.
	(new_structure): Set up subclass linked list.
	(set_gc_used_type): Propagate usage information to subclasses.
	(output_mangled_typename): Use get_ultimate_base_class.
	(walk_subclasses): Use the subclass linked list, avoiding an
	O(N^2) when writing out all types.
	(walk_type): Issue an error if the base class is missing a tag,
	rather than generating bogus C code.  Add a gcc_unreachable
	default case, in case people omit tags from concrete subclasses,
	or get the values wrong.
	(write_func_for_structure): Issue an error for subclasses for
	which the base doesn't have a "desc", since otherwise the
	autogenerated routines for the base would silently fail to visit
	any subclass fields.
	(write_root): Use get_ultimate_base_class, tweaking constness of
	tp to match that function's signature.
---
 gcc/doc/gty.texi     |   7 +++
 gcc/gengtype-state.c |   2 +
 gcc/gengtype.c       | 119 ++++++++++++++++++++++++++++++++++++++-------------
 gcc/gengtype.h       |  10 +++++
 4 files changed, 109 insertions(+), 29 deletions(-)

Comments

Basile Starynkevitch Oct. 26, 2013, 4:43 a.m. UTC | #1
On Fri, 2013-10-25 at 17:37 -0400, David Malcolm wrote:
> This patch addresses various forms of failure described in
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html
> 
> It adds a "default: gcc_unreachable();" to the autogenerated switch()
> statement in the routines for a base class, converting various kinds of
> tag errors from leading to silent lack-of-field traversal into giving
> run-time assertion failures (addressing (F) and (G))
> 
> It also issues an error within gengtype itself for a missing "desc"
> (failure "B"), turning this into an error message from gengtype.
> 
> I found another potential failure mode:
> 
> (H) If you had:
> 
> class GTY((desc("%0.kind"), tag("0"))) foo
> {
> public:
>   int kind;
>   tree p;
> };
> 
> class GTY((tag("1"))) bar : public foo
> {
> public:
>   tree q;
> };
> 
> static GTY(()) foo *dummy_foo;
> 
> and there are no explicit pointers to "bar" in the code, gengtype
> treated it as unused, and silently omitted the case for it from
> foo's marking routine.
> 
> I've updated set_gc_type_used so that it propagates usage down into
> subclasses (which recurses), fixing this issue.
> 
> To do this efficiently we need to track subclasses in within gengtype,
> so the patch also adds that, and uses it to eliminate an O(N^2).
> 
> Note that for error (G), if a class within the hierarchy omits a GTY
> marker, gengtype doesn't parse it at all, and so doesn't parse the
> inheritance information - so we can't issue a warning about this.
> However, the lack of a tag will trigger a run-time assertion failure
> due to hitting the "default:  gcc_unreachable();" in the switch.
> The patch also adds a paragraph to the docs, spelling out the need
> for evary class in such a hierarchy to have a GTY marker.
> 
> I believe this addresses all of the silent-lack-of-field-traversal
> issues, converting them to gengtype errors or runtime assertions.
> It also adds a handler for (E), turning this from a failure to
> compile bogus C to a specific error in gengtype.
> 
> I'm bootstrapping/regtesting now.
> OK for trunk if that passes?


I don't have the power to approve this patch, but I hope it will get
accepted.

Cheers.
Jeff Law Oct. 28, 2013, 9:19 p.m. UTC | #2
On 10/25/13 15:37, David Malcolm wrote:
> This patch addresses various forms of failure described in
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html
>
> It adds a "default: gcc_unreachable();" to the autogenerated switch()
> statement in the routines for a base class, converting various kinds of
> tag errors from leading to silent lack-of-field traversal into giving
> run-time assertion failures (addressing (F) and (G))
>
> It also issues an error within gengtype itself for a missing "desc"
> (failure "B"), turning this into an error message from gengtype.
>
> I found another potential failure mode:
>
> (H) If you had:
>
> class GTY((desc("%0.kind"), tag("0"))) foo
> {
> public:
>    int kind;
>    tree p;
> };
>
> class GTY((tag("1"))) bar : public foo
> {
> public:
>    tree q;
> };
>
> static GTY(()) foo *dummy_foo;
>
> and there are no explicit pointers to "bar" in the code, gengtype
> treated it as unused, and silently omitted the case for it from
> foo's marking routine.
>
> I've updated set_gc_type_used so that it propagates usage down into
> subclasses (which recurses), fixing this issue.
>
> To do this efficiently we need to track subclasses in within gengtype,
> so the patch also adds that, and uses it to eliminate an O(N^2).
>
> Note that for error (G), if a class within the hierarchy omits a GTY
> marker, gengtype doesn't parse it at all, and so doesn't parse the
> inheritance information - so we can't issue a warning about this.
> However, the lack of a tag will trigger a run-time assertion failure
> due to hitting the "default:  gcc_unreachable();" in the switch.
> The patch also adds a paragraph to the docs, spelling out the need
> for evary class in such a hierarchy to have a GTY marker.
>
> I believe this addresses all of the silent-lack-of-field-traversal
> issues, converting them to gengtype errors or runtime assertions.
> It also adds a handler for (E), turning this from a failure to
> compile bogus C to a specific error in gengtype.
>
> I'm bootstrapping/regtesting now.
> OK for trunk if that passes?
>
> gcc/
> 	* doc/gty.texi ("Inheritance and GTY"): Make it clear that
> 	to use autogenerated markers for a class-hierarchy, every class
> 	must have a GTY marker.
> 	* gengtype.h (struct type): Add linked list of subclasses to
> 	the "s" member of the union.
> 	(add_subclass): New decl.
> 	* gengtype-state.c (read_state_struct_type): Set up subclass
> 	linked list.
> 	* gengtype.c (get_ultimate_base_class): New.
> 	(add_subclass): New.
> 	(new_structure): Set up subclass linked list.
> 	(set_gc_used_type): Propagate usage information to subclasses.
> 	(output_mangled_typename): Use get_ultimate_base_class.
> 	(walk_subclasses): Use the subclass linked list, avoiding an
> 	O(N^2) when writing out all types.
> 	(walk_type): Issue an error if the base class is missing a tag,
> 	rather than generating bogus C code.  Add a gcc_unreachable
> 	default case, in case people omit tags from concrete subclasses,
> 	or get the values wrong.
> 	(write_func_for_structure): Issue an error for subclasses for
> 	which the base doesn't have a "desc", since otherwise the
> 	autogenerated routines for the base would silently fail to visit
> 	any subclass fields.
> 	(write_root): Use get_ultimate_base_class, tweaking constness of
> 	tp to match that function's signature.
Thanks for diving into this stuff and getting it fixed.

OK for the trunk,

Jeff
David Malcolm Oct. 29, 2013, 1:24 a.m. UTC | #3
On Mon, 2013-10-28 at 15:19 -0600, Jeff Law wrote:
> On 10/25/13 15:37, David Malcolm wrote:
> > This patch addresses various forms of failure described in
> > http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html
> >
> [...]
> Thanks for diving into this stuff and getting it fixed.
> 
> OK for the trunk,

Thanks; it passed bootstrap&regtesting on x86_64-unknown-linux, so I've
committed it to trunk as r204148, completing (I hope) the addition of
support in gengtype for some forms of simple class hierarchies.

I believe this may also complete the prerequisites for my conversion of
the symtable types into a C++ class hierarchy (in that IIRC all of those
patches were approved, but were pending this gengtype feature)... but
I'll have to see to what extent they've bitrotted (and recheck the
bootstrap/regtest, naturally).

Dave
diff mbox

Patch

diff --git a/gcc/doc/gty.texi b/gcc/doc/gty.texi
index 090f6a6..a64d110 100644
--- a/gcc/doc/gty.texi
+++ b/gcc/doc/gty.texi
@@ -497,6 +497,13 @@  The base class and its discriminator must be identified using the ``desc''
 option.  Each concrete subclass must use the ``tag'' option to identify
 which value of the discriminator it corresponds to.
 
+Every class in the hierarchy must have a @code{GTY(())} marker, as
+gengtype will only attempt to parse classes that have such a marker
+@footnote{Classes lacking such a marker will not be identified as being
+part of the hierarchy, and so the marking routines will not handle them,
+leading to a assertion failure within the marking routines due to an
+unknown tag value (assuming that assertions are enabled).}.
+
 @smallexample
 class GTY((desc("%h.kind"), tag("0"))) example_base
 @{
diff --git a/gcc/gengtype-state.c b/gcc/gengtype-state.c
index 1e9fade..fda473a 100644
--- a/gcc/gengtype-state.c
+++ b/gcc/gengtype-state.c
@@ -1615,6 +1615,8 @@  read_state_struct_type (type_p type)
       read_state_lang_bitmap (&(type->u.s.bitmap));
       read_state_type (&(type->u.s.lang_struct));
       read_state_type (&(type->u.s.base_class));
+      if (type->u.s.base_class)
+	add_subclass (type->u.s.base_class, type);
     }
   else
     {
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index 31e0f99..f35952e 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -137,6 +137,16 @@  xasprintf (const char *format, ...)
   return result;
 }
 
+/* Locate the ultimate base class of struct S.  */
+
+static const_type_p
+get_ultimate_base_class (const_type_p s)
+{
+  while (s->u.s.base_class)
+    s = s->u.s.base_class;
+  return s;
+}
+
 /* Input file handling. */
 
 /* Table of all input files.  */
@@ -667,6 +677,16 @@  resolve_typedef (const char *s, struct fileloc *pos)
   return p;
 }
 
+/* Add SUBCLASS to head of linked list of BASE's subclasses.  */
+
+void add_subclass (type_p base, type_p subclass)
+{
+  gcc_assert (union_or_struct_p (base));
+  gcc_assert (union_or_struct_p (subclass));
+
+  subclass->u.s.next_sibling_class = base->u.s.first_subclass;
+  base->u.s.first_subclass = subclass;
+}
 
 /* Create and return a new structure with tag NAME at POS with fields
    FIELDS and options O.  The KIND of structure must be one of
@@ -749,6 +769,8 @@  new_structure (const char *name, enum typekind kind, struct fileloc *pos,
   if (s->u.s.lang_struct)
     s->u.s.lang_struct->u.s.bitmap |= bitmap;
   s->u.s.base_class = base_class;
+  if (base_class)
+    add_subclass (base_class, s);
 
   return s;
 }
@@ -1535,6 +1557,12 @@  set_gc_used_type (type_p t, enum gc_used_enum level, type_p param[NUM_PARAM],
 	if (t->u.s.base_class)
 	  set_gc_used_type (t->u.s.base_class, level, param,
 			    allow_undefined_types);
+	/* Anything pointing to a base class might actually be pointing
+	   to a subclass.  */
+	for (type_p subclass = t->u.s.first_subclass; subclass;
+	     subclass = subclass->u.s.next_sibling_class)
+	  set_gc_used_type (subclass, level, param,
+			    allow_undefined_types);
 
 	FOR_ALL_INHERITED_FIELDS(t, f)
 	  {
@@ -2554,8 +2582,7 @@  output_mangled_typename (outf_p of, const_type_p t)
 	  /* For references to classes within an inheritance hierarchy,
 	     only ever reference the ultimate base class, since only
 	     it will have gt_ functions.  */
-	  while (t->u.s.base_class)
-	    t = t->u.s.base_class;
+	  t = get_ultimate_base_class (t);
 	  const char *id_for_tag = filter_type_name (t->u.s.tag);
 	  oprintf (of, "%lu%s", (unsigned long) strlen (id_for_tag),
 		   id_for_tag);
@@ -2630,30 +2657,28 @@  get_string_option (options_p opt, const char *key)
 static void
 walk_subclasses (type_p base, struct walk_type_data *d)
 {
-  for (type_p sub = structures; sub != NULL; sub = sub->next)
+  for (type_p sub = base->u.s.first_subclass; sub != NULL;
+       sub = sub->u.s.next_sibling_class)
     {
-      if (sub->u.s.base_class == base)
+      const char *type_tag = get_string_option (sub->u.s.opt, "tag");
+      if (type_tag)
 	{
-	  const char *type_tag = get_string_option (sub->u.s.opt, "tag");
-	  if (type_tag)
-	    {
-	      oprintf (d->of, "%*scase %s:\n", d->indent, "", type_tag);
-	      d->indent += 2;
-	      oprintf (d->of, "%*s{\n", d->indent, "");
-	      d->indent += 2;
-	      oprintf (d->of, "%*s%s *sub = static_cast <%s *> (x);\n",
-		       d->indent, "", sub->u.s.tag, sub->u.s.tag);
-	      const char *old_val = d->val;
-	      d->val = "(*sub)";
-	      walk_type (sub, d);
-	      d->val = old_val;
-	      d->indent -= 2;
-	      oprintf (d->of, "%*s}\n", d->indent, "");
-	      oprintf (d->of, "%*sbreak;\n", d->indent, "");
-	      d->indent -= 2;
-	    }
-	  walk_subclasses (sub, d);
+	  oprintf (d->of, "%*scase %s:\n", d->indent, "", type_tag);
+	  d->indent += 2;
+	  oprintf (d->of, "%*s{\n", d->indent, "");
+	  d->indent += 2;
+	  oprintf (d->of, "%*s%s *sub = static_cast <%s *> (x);\n",
+		   d->indent, "", sub->u.s.tag, sub->u.s.tag);
+	  const char *old_val = d->val;
+	  d->val = "(*sub)";
+	  walk_type (sub, d);
+	  d->val = old_val;
+	  d->indent -= 2;
+	  oprintf (d->of, "%*s}\n", d->indent, "");
+	  oprintf (d->of, "%*sbreak;\n", d->indent, "");
+	  d->indent -= 2;
 	}
+      walk_subclasses (sub, d);
     }
 }
 
@@ -3023,6 +3048,20 @@  walk_type (type_p t, struct walk_type_data *d)
 	  }
 	else if (desc)
 	  {
+	    /* We have a "desc" option on a struct, signifying the
+	       base class within a GC-managed inheritance hierarchy.
+	       The current code specialcases the base class, then walks
+	       into subclasses, recursing into this routine to handle them.
+	       This organization requires the base class to have a case in
+	       the switch statement, and hence a tag value is mandatory
+	       for the base class.   This restriction could be removed, but
+	       it would require some restructing of this code.  */
+	    if (!type_tag)
+	      {
+		error_at_line (d->line,
+			       "missing `tag' option for type `%s'",
+			       t->u.s.tag);
+	      }
 	    oprintf (d->of, "%*sswitch (", d->indent, "");
 	    output_escaped_param (d, desc, "desc");
 	    oprintf (d->of, ")\n");
@@ -3188,6 +3227,16 @@  walk_type (type_p t, struct walk_type_data *d)
 	    /* Add cases to handle subclasses.  */
 	    walk_subclasses (t, d);
 
+	    /* Ensure that if someone forgets a "tag" option that we don't
+	       silent fail to traverse that subclass's fields.  */
+	    if (!seen_default_p)
+	      {
+		oprintf (d->of, "%*s/* Unrecognized tag value.  */\n",
+			 d->indent, "");
+		oprintf (d->of, "%*sdefault: gcc_unreachable (); \n",
+			 d->indent, "");
+	      }
+
 	    /* End of the switch statement */
 	    oprintf (d->of, "%*s}\n", d->indent, "");
 	    d->indent -= 2;
@@ -3537,10 +3586,23 @@  write_func_for_structure (type_p orig_s, type_p s, type_p *param,
   options_p opt;
   struct walk_type_data d;
 
-  /* Don't write fns for subclasses, only for the ultimate base class
-     within an inheritance hierarchy.  */
   if (s->u.s.base_class)
-    return;
+    {
+      /* Verify that the base class has a "desc", since otherwise
+	 the traversal hooks there won't attempt to visit fields of
+	 subclasses such as this one.  */
+      const_type_p ubc = get_ultimate_base_class (s);
+      if ((!opts_have (ubc->u.s.opt, "user")
+	   && !opts_have (ubc->u.s.opt, "desc")))
+	error_at_line (&s->u.s.line,
+		       ("'%s' is a subclass of non-GTY(user) GTY class '%s'"
+			", but '%s' lacks a discriminator 'desc' option"),
+		       s->u.s.tag, ubc->u.s.tag, ubc->u.s.tag);
+
+      /* Don't write fns for subclasses, only for the ultimate base class
+	 within an inheritance hierarchy.  */
+      return;
+    }
 
   memset (&d, 0, sizeof (d));
   d.of = get_output_file_for_structure (s, param);
@@ -4452,7 +4514,7 @@  write_root (outf_p f, pair_p v, type_p type, const char *name, int has_length,
 
     case TYPE_POINTER:
       {
-	type_p tp;
+	const_type_p tp;
 
 	if (!start_root_entry (f, v, name, line))
 	  return;
@@ -4461,8 +4523,7 @@  write_root (outf_p f, pair_p v, type_p type, const char *name, int has_length,
 
 	if (!has_length && union_or_struct_p (tp))
 	  {
-	    while (tp->u.s.base_class)
-	      tp = tp->u.s.base_class;
+	    tp = get_ultimate_base_class (tp);
 	    const char *id_for_tag = filter_type_name (tp->u.s.tag);
 	    oprintf (f, "    &gt_ggc_mx_%s,\n", id_for_tag);
 	    if (emit_pch)
diff --git a/gcc/gengtype.h b/gcc/gengtype.h
index 2d20bf9..0c3ec96 100644
--- a/gcc/gengtype.h
+++ b/gcc/gengtype.h
@@ -293,6 +293,15 @@  struct type {
       type_p lang_struct;
 
       type_p base_class; /* the parent class, if any.  */
+
+      /* The following two fields are not serialized in state files, and
+	 are instead reconstructed on load.  */
+
+      /* The head of a singly-linked list of immediate descendents in
+	 the inheritance hierarchy.  */
+      type_p first_subclass;
+      /* The next in that list.  */
+      type_p next_sibling_class;
     } s;
 
     /* when TYPE_SCALAR: */
@@ -424,6 +433,7 @@  extern char *xasprintf (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void do_typedef (const char *s, type_p t, struct fileloc *pos);
 extern void do_scalar_typedef (const char *s, struct fileloc *pos);
 extern type_p resolve_typedef (const char *s, struct fileloc *pos);
+extern void add_subclass (type_p base, type_p subclass);
 extern type_p new_structure (const char *name, enum typekind kind,
 			     struct fileloc *pos, pair_p fields,
 			     options_p o, type_p base);