Patchwork [Ada] fix potential memory corruption in annotated value cache

login
register
mail settings
Submitter Alexandre Oliva
Date Sept. 16, 2011, 7:02 a.m.
Message ID <orzki59dh3.fsf@livre.localdomain>
Download mbox | patch
Permalink /patch/114892/
State New
Headers show

Comments

Alexandre Oliva - Sept. 16, 2011, 7:02 a.m.
A -fcompare-debug regression in s-regexp.adb on x86_64-linux-gnu turned
out to be caused by a hashtable management error in annotate_value().
We ask for an insertion and leave the allocated slot empty while
proceeding to other computations that might (a) return without filling
it in, or (b) recurse and allocate the same slot or (c) grow and move
the table.

(a) is a performance problem because it might render cached values
invisible, should the allocated slot be formerly deleted, thus breaking
a rehash chain.

(b) is also a performance problem, because when the context that first
allocated the slot proceeds to fill it in, it may override another
cached value that happened to be assigned the same slot.  This is what
caused the -fcompare-debug difference: the annotation of a value had its
cache entry overridden by an upstream caller in only one of the
compilations, so a subsequent call of annotate_value with the same value
resulted in a use of the cached value in one, and an expansion that
remapped new decls in the other.  This out-of-sync decl numbering ended
up causing different symbol names to be chosen within
lhd_set_decl_assembler_name().

(c) is the scariest possibility: if the hash table that holds cached
values grows to the point of being moved during recursion, and upon
return we fill in the pointer in the slot that is in the old (possibly
reused) storage, we may be corrupting internal compiler state.

Some possible fixes I considered were:

1. inserting on entry (as is), allocating the cache entry right away,
and *always* filling it before returning

2. inserting on entry (as is), allocating the cache entry right away,
and releasing it before returning unless we're filling it in

3. not inserting on entry, and looking up again for insertion before
caching and returning, so as to get a fresh slot pointer

I implemented 3., and considered splitting the logic of annotate_value()
into one function that manages caching and calls the other to perform
the computation, so as to simplify the implementation.

Here's the patch I've tested on i686-pc-linux-gnu and x86_64-linux-gnu.
Ok to install?
Jakub Jelinek - Sept. 16, 2011, 8:22 a.m.
On Fri, Sep 16, 2011 at 04:02:32AM -0300, Alexandre Oliva wrote:
> -      struct tree_int_map in;
> +      struct tree_int_map **h;
> +
>        if (!annotate_value_cache)
>          annotate_value_cache = htab_create_ggc (512, tree_int_map_hash,
>  					        tree_int_map_eq, 0);
>        in.base.from = gnu_size;
>        h = (struct tree_int_map **)
> -	    htab_find_slot (annotate_value_cache, &in, INSERT);
> +	    htab_find_slot (annotate_value_cache, &in, NO_INSERT);

I wonder why don't you use htab_find instead here.

> -      if (*h)
> +      if (h)
>  	return (Node_Ref_Or_Val) (*h)->to;
>      }

	Jakub
Eric Botcazou - Sept. 16, 2011, 8:45 p.m.
> Some possible fixes I considered were:
>
> 1. inserting on entry (as is), allocating the cache entry right away,
> and *always* filling it before returning
>
> 2. inserting on entry (as is), allocating the cache entry right away,
> and releasing it before returning unless we're filling it in
>
> 3. not inserting on entry, and looking up again for insertion before
> caching and returning, so as to get a fresh slot pointer
>
> I implemented 3., and considered splitting the logic of annotate_value()
> into one function that manages caching and calls the other to perform
> the computation, so as to simplify the implementation.

This looks like the most straightforward solution indeed.

> Here's the patch I've tested on i686-pc-linux-gnu and x86_64-linux-gnu.
> Ok to install?

Yes, modulo Jakub's remark and s/NULL/NULL_TREE for zeroing in.base.from.

Patch

for  gcc/ada/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* gcc-interface/decl.c (annotate_value): Look up expression for
	insertion in the cache at the end.

Index: gcc/ada/gcc-interface/decl.c
===================================================================
--- gcc/ada/gcc-interface/decl.c.orig	2011-09-15 03:51:42.984761174 -0300
+++ gcc/ada/gcc-interface/decl.c	2011-09-15 03:51:44.698733097 -0300
@@ -7471,23 +7471,26 @@  annotate_value (tree gnu_size)
 {
   TCode tcode;
   Node_Ref_Or_Val ops[3], ret;
-  struct tree_int_map **h = NULL;
+  struct tree_int_map in;
   int i;
 
   /* See if we've already saved the value for this node.  */
   if (EXPR_P (gnu_size))
     {
-      struct tree_int_map in;
+      struct tree_int_map **h;
+
       if (!annotate_value_cache)
         annotate_value_cache = htab_create_ggc (512, tree_int_map_hash,
 					        tree_int_map_eq, 0);
       in.base.from = gnu_size;
       h = (struct tree_int_map **)
-	    htab_find_slot (annotate_value_cache, &in, INSERT);
+	    htab_find_slot (annotate_value_cache, &in, NO_INSERT);
 
-      if (*h)
+      if (h)
 	return (Node_Ref_Or_Val) (*h)->to;
     }
+  else
+    in.base.from = NULL;
 
   /* If we do not return inside this switch, TCODE will be set to the
      code to use for a Create_Node operand and LEN (set above) will be
@@ -7588,8 +7591,17 @@  annotate_value (tree gnu_size)
   ret = Create_Node (tcode, ops[0], ops[1], ops[2]);
 
   /* Save the result in the cache.  */
-  if (h)
+  if (in.base.from)
     {
+      struct tree_int_map **h;
+      /* We can't assume the hash table data hasn't moved since the
+	 initial look up, so we have to search again.  Allocating and
+	 inserting an entry at that point would be an alternative, but
+	 then we'd better discard the entry if we decided not to cache
+	 it.  */
+      h = (struct tree_int_map **)
+	    htab_find_slot (annotate_value_cache, &in, INSERT);
+      gcc_assert (!*h);
       *h = ggc_alloc_tree_int_map ();
       (*h)->base.from = gnu_size;
       (*h)->to = ret;