diff mbox series

C++ Module Binding Vector

Message ID 502423a4-a715-c2d9-039e-e473a81d6331@acm.org
State New
Headers show
Series C++ Module Binding Vector | expand

Commit Message

Nathan Sidwell Dec. 2, 2020, 2:19 p.m. UTC
This is modified from the original patch.  firstly it contains a few 
more necessary bits.  But mainly I renamed the entities from 
'MODULE_foo' to 'BINDING_foo', as it's a vector of bindings used by 
modules (not a vector of modules).  It also belongs in name-lookup.h.

Pushing to trunk

This adds the vector necessary to hold different module's namespace
bindings.  We add a new tree-node 'tree_binding_vec', which contains a
sparse array, indexed by module number.  To avoid space	wasting, this
is allocated in clusters using 'unsigned short'	as the index value (so
that's one of the upper bounds on module importing).  If there are
only bindings from the current TU, there is no vector, so we have the
same representation as a non-module compilation.

To support lazy	loading, a binding slot	can contain either a tree (the
binding), or a cookie that the module machinery uses to load the
required binding on demand.

The first 2 or 3 slots end up being reserved for fixed meanings.
There are a couple of flags we have to record on a binding, to know
whether the same declaration could appear in two different slots.

         gcc/cp/
	* cp-tree.def (BINDING_VECTOR): New.
	* name-lookup.h (struct binding_slot): New.
	(BINDING_VECTOR_SLOTS_PER_CLUSTER): New.
	(struct binding_index, struct binding_cluster): New.
	(BINDING_VECTOR_ALLOC_CLUSTERS, BINDING_VECTOR_CLUSTER_BASE)
	(BINDING_VECTOR_CLUSTER): New.
	(struct tree_binding_vec): New.
	(BINDING_VECTOR_NAME, BINDING_VECTOR_GLOBAL_DUPS_P)
	(BINDING_VECTOR_PARTITION_DUPS_P): New.
	(BINDING_BINDING_GLOBAL_P, BINDING_BINDING_PARTITION_P): New.
	(BINDING_VECTOR_PENDING_SPECIALIZATIONS)
	(BINDING_VECTOR_PENDING_IS_HEADER_P)
	(BINDING_VECTOR_PENDING_IS_PARTITION_P): New.
	* cp-tree.h (enum cp_tree_node_structure_enum): Add
	TS_CP_BINDING_VECTOR.
	(union lang_tree_node): Add binding_vec field.
	(make_binding_vec): Declare.
	(named_decl_hash::hash, named_decl_hash::equal): Check for binding
	vector.
	* decl.c (cp_tree_node_structure): Add BINDING_VECTOR case.
	* ptree.c (cxx_print_xnode): Add BINDING_VECTOR case.
	* tree.c (make_binding_vec): New.

Comments

Jason Merrill Dec. 11, 2020, 8:39 p.m. UTC | #1
On 12/2/20 9:19 AM, Nathan Sidwell wrote:
> This is modified from the original patch.  firstly it contains a few 
> more necessary bits.  But mainly I renamed the entities from 
> 'MODULE_foo' to 'BINDING_foo', as it's a vector of bindings used by 
> modules (not a vector of modules).  It also belongs in name-lookup.h.
> 
> Pushing to trunk
> 
> This adds the vector necessary to hold different module's namespace
> bindings.  We add a new tree-node 'tree_binding_vec', which contains a
> sparse array, indexed by module number.  To avoid space    wasting, this
> is allocated in clusters using 'unsigned short'    as the index value (so
> that's one of the upper bounds on module importing).  If there are
> only bindings from the current TU, there is no vector, so we have the
> same representation as a non-module compilation.
> 
> To support lazy    loading, a binding slot    can contain either a tree 
> (the
> binding), or a cookie that the module machinery uses to load the
> required binding on demand.
> 
> The first 2 or 3 slots end up being reserved for fixed meanings.
> There are a couple of flags we have to record on a binding, to know
> whether the same declaration could appear in two different slots.
> 
>          gcc/cp/
>      * cp-tree.def (BINDING_VECTOR): New.
>      * name-lookup.h (struct binding_slot): New.
>      (BINDING_VECTOR_SLOTS_PER_CLUSTER): New.
>      (struct binding_index, struct binding_cluster): New.
>      (BINDING_VECTOR_ALLOC_CLUSTERS, BINDING_VECTOR_CLUSTER_BASE)
>      (BINDING_VECTOR_CLUSTER): New.
>      (struct tree_binding_vec): New.
>      (BINDING_VECTOR_NAME, BINDING_VECTOR_GLOBAL_DUPS_P)
>      (BINDING_VECTOR_PARTITION_DUPS_P): New.
>      (BINDING_BINDING_GLOBAL_P, BINDING_BINDING_PARTITION_P): New.
>      (BINDING_VECTOR_PENDING_SPECIALIZATIONS)
>      (BINDING_VECTOR_PENDING_IS_HEADER_P)
>      (BINDING_VECTOR_PENDING_IS_PARTITION_P): New.
>      * cp-tree.h (enum cp_tree_node_structure_enum): Add
>      TS_CP_BINDING_VECTOR.
>      (union lang_tree_node): Add binding_vec field.
>      (make_binding_vec): Declare.
>      (named_decl_hash::hash, named_decl_hash::equal): Check for binding
>      vector.
>      * decl.c (cp_tree_node_structure): Add BINDING_VECTOR case.
>      * ptree.c (cxx_print_xnode): Add BINDING_VECTOR case.
>      * tree.c (make_binding_vec): New.
...
> +extern tree make_binding_vec			(tree, unsigned clusters);
...
> +make_binding_vec (tree name, unsigned clusters MEM_STAT_DECL)

You need the MEM_STAT decoration on the declaration as well, or building 
with --enable-gather-detailed-mem-stats breaks.
Nathan Sidwell Dec. 14, 2020, 1:37 p.m. UTC | #2
On 12/11/20 3:39 PM, Jason Merrill wrote:
> On 12/2/20 9:19 AM, Nathan Sidwell wrote:
>> This is modified from the original patch.  firstly it contains a few 
>> more necessary bits.  But mainly I renamed the entities from 

>> +extern tree make_binding_vec            (tree, unsigned clusters);
> ...
>> +make_binding_vec (tree name, unsigned clusters MEM_STAT_DECL)
> 
> You need the MEM_STAT decoration on the declaration as well, or building 
> with --enable-gather-detailed-mem-stats breaks.

oops, thanks for fixing

nathan
diff mbox series

Patch

diff --git i/gcc/cp/cp-tree.def w/gcc/cp/cp-tree.def
index a188576013b..4e73e46b4a0 100644
--- i/gcc/cp/cp-tree.def
+++ w/gcc/cp/cp-tree.def
@@ -233,6 +233,9 @@  DEFTREECODE (TEMPLATE_ID_EXPR, "template_id_expr", tcc_expression, 2)
 /* One of a set of overloaded functions.  */
 DEFTREECODE (OVERLOAD, "overload", tcc_exceptional, 0)
 
+/* A vector of binding slots.  */
+DEFTREECODE (BINDING_VECTOR, "binding_vector", tcc_exceptional, 0)
+
 /* A pseudo-destructor, of the form "OBJECT.~DESTRUCTOR" or
    "OBJECT.SCOPE::~DESTRUCTOR.  The first operand is the OBJECT.  The
    second operand (if non-NULL) is the SCOPE.  The third operand is
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 021de76e142..b5d4fc0aba1 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -1665,6 +1665,7 @@  enum cp_tree_node_structure_enum {
   TS_CP_TPI,
   TS_CP_PTRMEM,
   TS_CP_OVERLOAD,
+  TS_CP_BINDING_VECTOR,
   TS_CP_BASELINK,
   TS_CP_TEMPLATE_DECL,
   TS_CP_DEFERRED_PARSE,
@@ -1686,6 +1687,7 @@  union GTY((desc ("cp_tree_node_structure (&%h)"),
   struct template_parm_index GTY ((tag ("TS_CP_TPI"))) tpi;
   struct ptrmem_cst GTY ((tag ("TS_CP_PTRMEM"))) ptrmem;
   struct tree_overload GTY ((tag ("TS_CP_OVERLOAD"))) overload;
+  struct tree_binding_vec GTY ((tag ("TS_CP_BINDING_VECTOR"))) binding_vec;
   struct tree_baselink GTY ((tag ("TS_CP_BASELINK"))) baselink;
   struct tree_template_decl GTY ((tag ("TS_CP_TEMPLATE_DECL"))) template_decl;
   struct tree_deferred_parse GTY ((tag ("TS_CP_DEFERRED_PARSE"))) deferred_parse;
@@ -7386,6 +7388,7 @@  extern tree hash_tree_cons			(tree, tree, tree);
 extern tree hash_tree_chain			(tree, tree);
 extern tree build_qualified_name		(tree, tree, tree, bool);
 extern tree build_ref_qualified_type		(tree, cp_ref_qualifier);
+extern tree make_binding_vec			(tree, unsigned clusters);
 inline tree ovl_first				(tree) ATTRIBUTE_PURE;
 extern tree ovl_make				(tree fn,
 						 tree next = NULL_TREE);
@@ -8020,14 +8023,16 @@  type_unknown_p (const_tree expr)
 inline hashval_t
 named_decl_hash::hash (const value_type decl)
 {
-  tree name = OVL_NAME (decl);
+  tree name = (TREE_CODE (decl) == BINDING_VECTOR
+	       ? BINDING_VECTOR_NAME (decl) : OVL_NAME (decl));
   return name ? IDENTIFIER_HASH_VALUE (name) : 0;
 }
 
 inline bool
 named_decl_hash::equal (const value_type existing, compare_type candidate)
 {
-  tree name = OVL_NAME (existing);
+  tree name = (TREE_CODE (existing) == BINDING_VECTOR
+	       ? BINDING_VECTOR_NAME (existing) : OVL_NAME (existing));
   return candidate == name;
 }
 
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 3dd4b076582..df76155a243 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -17594,6 +17594,7 @@  cp_tree_node_structure (union lang_tree_node * t)
     case DEFERRED_PARSE:	return TS_CP_DEFERRED_PARSE;
     case IDENTIFIER_NODE:	return TS_CP_IDENTIFIER;
     case LAMBDA_EXPR:		return TS_CP_LAMBDA_EXPR;
+    case BINDING_VECTOR:		return TS_CP_BINDING_VECTOR;
     case OVERLOAD:		return TS_CP_OVERLOAD;
     case PTRMEM_CST:		return TS_CP_PTRMEM;
     case STATIC_ASSERT:		return TS_CP_STATIC_ASSERT;
diff --git i/gcc/cp/name-lookup.h w/gcc/cp/name-lookup.h
index 6d18539e730..9cc8a42f4bd 100644
--- i/gcc/cp/name-lookup.h
+++ w/gcc/cp/name-lookup.h
@@ -68,6 +68,125 @@  struct GTY(()) cxx_saved_binding {
   tree real_type_value;
 };
 
+/* To support lazy module loading, we squirrel away a section number
+   (and a couple of flags) in the binding slot of unloaded bindings.
+   We rely on pointers being aligned and setting the bottom bit to
+   mark a lazy value.  GTY doesn't like an array of union, so we have
+   a containing struct.  */
+
+struct GTY(()) binding_slot {
+  union GTY((desc ("%1.is_lazy ()"))) binding_slot_lazy {
+    tree GTY((tag ("false"))) binding;
+  } u;
+
+  operator tree & ()
+  {
+    gcc_checking_assert (!is_lazy ());
+    return u.binding;
+  }
+  binding_slot &operator= (tree t)
+  {
+    u.binding = t;
+    return *this;
+  }
+  bool is_lazy () const
+  {
+    return bool (uintptr_t (u.binding) & 1);
+  }
+  void set_lazy (unsigned snum)
+  {
+    gcc_checking_assert (!u.binding);
+    u.binding = tree (uintptr_t ((snum << 1) | 1));
+  }
+  void or_lazy (unsigned snum)
+  {
+    gcc_checking_assert (is_lazy ());
+    u.binding = tree (uintptr_t (u.binding) | (snum << 1));
+  }
+  unsigned get_lazy () const
+  {
+    gcc_checking_assert (is_lazy ());
+    return unsigned (uintptr_t (u.binding) >> 1);
+  }
+};
+
+/* Bindings for modules are held in a sparse array.  There is always a
+   current TU slot, others are allocated as needed.  By construction
+   of the importing mechanism we only ever need to append to the
+   array.  Rather than have straight index/slot tuples, we bunch them
+   up for greater packing.
+
+   The cluster representation packs well on a 64-bit system.  */
+
+#define BINDING_VECTOR_SLOTS_PER_CLUSTER 2
+struct binding_index {
+  unsigned short base;
+  unsigned short span;
+};
+
+struct GTY(()) binding_cluster
+{
+  binding_index GTY((skip)) indices[BINDING_VECTOR_SLOTS_PER_CLUSTER];
+  binding_slot slots[BINDING_VECTOR_SLOTS_PER_CLUSTER];
+};
+
+/* These two fields overlay lang flags.  So don't use those.  */
+#define BINDING_VECTOR_ALLOC_CLUSTERS(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.u.dependence_info.clique)
+#define BINDING_VECTOR_NUM_CLUSTERS(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.u.dependence_info.base)
+#define BINDING_VECTOR_CLUSTER_BASE(NODE) \
+  (((tree_binding_vec *)BINDING_VECTOR_CHECK (NODE))->vec)
+#define BINDING_VECTOR_CLUSTER_LAST(NODE) \
+  (&BINDING_VECTOR_CLUSTER (NODE, BINDING_VECTOR_NUM_CLUSTERS (NODE) - 1))
+#define BINDING_VECTOR_CLUSTER(NODE,IX) \
+  (((tree_binding_vec *)BINDING_VECTOR_CHECK (NODE))->vec[IX])
+
+struct GTY(()) tree_binding_vec {
+  struct tree_base base;
+  tree name;
+  binding_cluster GTY((length ("%h.base.u.dependence_info.base"))) vec[1];
+};
+
+/* The name of a module vector.  */
+#define BINDING_VECTOR_NAME(NODE) \
+  (((tree_binding_vec *)BINDING_VECTOR_CHECK (NODE))->name)
+
+/* tree_binding_vec does uses  base.u.dependence_info.base field for
+   length.  It does not have lang_flag etc available!  */
+
+/* These two flags note if a module-vector contains deduplicated
+   bindings (i.e. multiple declarations in different imports).  */
+/* This binding contains duplicate references to a global module
+   entity.  */
+#define BINDING_VECTOR_GLOBAL_DUPS_P(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.static_flag)
+/* This binding contains duplicate references to a partioned module
+   entity.  */
+#define BINDING_VECTOR_PARTITION_DUPS_P(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.volatile_flag)
+
+/* These two flags indicate the provenence of the bindings on this
+   particular vector slot.  We can of course determine this from slot
+   number, but that's a relatively expensive lookup.  This avoids
+   that when iterating.  */
+/* This slot is part of the global module (a header unit).  */
+#define MODULE_BINDING_GLOBAL_P(NODE) \
+  (OVERLOAD_CHECK (NODE)->base.static_flag)
+/* This slot is part of the current module (a partition or primary).  */
+#define MODULE_BINDING_PARTITION_P(NODE)		\
+  (OVERLOAD_CHECK (NODE)->base.volatile_flag)
+
+/* There are specializations of a template keyed to this binding.  */
+#define BINDING_VECTOR_PENDING_SPECIALIZATIONS_P(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.public_flag)
+/* The key is in a header unit (not a named module partition or
+   primary).  */
+#define BINDING_VECTOR_PENDING_IS_HEADER_P(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.protected_flag)
+/* The key is in a named module (primary or partition).  */
+#define BINDING_VECTOR_PENDING_IS_PARTITION_P(NODE) \
+  (BINDING_VECTOR_CHECK (NODE)->base.private_flag)
 
 extern tree identifier_type_value (tree);
 extern void set_identifier_type_value (tree, tree);
diff --git i/gcc/cp/ptree.c w/gcc/cp/ptree.c
index a28b722f571..1ee107f23cc 100644
--- i/gcc/cp/ptree.c
+++ w/gcc/cp/ptree.c
@@ -250,6 +250,44 @@  cxx_print_xnode (FILE *file, tree node, int indent)
       print_node (file, "function", OVL_FUNCTION (node), indent + 4);
       print_node (file, "next", OVL_CHAIN (node), indent + 4);
       break;
+    case BINDING_VECTOR:
+      {
+	unsigned len = BINDING_VECTOR_NUM_CLUSTERS (node);
+	print_node (file, "name", BINDING_VECTOR_NAME (node), indent + 4);
+	fprintf (file, " clusters %u, alloc %u", len,
+		 BINDING_VECTOR_ALLOC_CLUSTERS (node));
+	for (unsigned ix = 0; ix != len; ix++)
+	  {
+	    binding_cluster *cluster = &BINDING_VECTOR_CLUSTER (node, ix);
+	    char pfx[20];
+	    for (unsigned jx = 0; jx != BINDING_VECTOR_SLOTS_PER_CLUSTER; jx++)
+	      if (cluster->indices[jx].span)
+		{
+		  int len = sprintf (pfx, "module:%u",
+				     cluster->indices[jx].base);
+		  if (cluster->indices[jx].span > 1)
+		    len
+		      += sprintf (&pfx[len], "(+%u)", cluster->indices[jx].span);
+		  len += sprintf (&pfx[len], " cluster:%u/%u", ix, jx);
+		  binding_slot &slot = cluster->slots[jx];
+		  if (slot.is_lazy ())
+		    {
+		      indent_to (file, indent + 4);
+		      unsigned lazy = slot.get_lazy ();
+		      fprintf (file, "%s snum:%u flags:%d",
+			       pfx, lazy >> 2, lazy & 3);
+		    }
+		  else if (slot)
+		    print_node (file, pfx, slot, indent + 4);
+		  else
+		    {
+		      indent_to (file, indent + 4);
+		      fprintf (file, "%s NULL", pfx);
+		    }
+		}
+	  }
+      }
+      break;
     case TEMPLATE_PARM_INDEX:
       print_node (file, "decl", TEMPLATE_PARM_DECL (node), indent+4);
       indent_to (file, indent + 3);
diff --git i/gcc/cp/tree.c w/gcc/cp/tree.c
index 28e591086b3..4113a34b0c1 100644
--- i/gcc/cp/tree.c
+++ w/gcc/cp/tree.c
@@ -2218,6 +2218,23 @@  build_ref_qualified_type (tree type, cp_ref_qualifier rqual)
   return build_cp_fntype_variant (type, rqual, raises, late);
 }
 
+tree
+make_binding_vec (tree name, unsigned clusters MEM_STAT_DECL)
+{
+  /* Stored in an unsigned short, but we're limited to the number of
+     modules anyway.  */
+  gcc_checking_assert (clusters <= (unsigned short)(~0));
+  size_t length = (offsetof (tree_binding_vec, vec)
+		   + clusters * sizeof (binding_cluster));
+  tree vec = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+  TREE_SET_CODE (vec, BINDING_VECTOR);
+  BINDING_VECTOR_NAME (vec) = name;
+  BINDING_VECTOR_ALLOC_CLUSTERS (vec) = clusters;
+  BINDING_VECTOR_NUM_CLUSTERS (vec) = 0;
+
+  return vec;
+}
+
 /* Make a raw overload node containing FN.  */
 
 tree