diff mbox series

avoid invoking assignment on uninitialized objects (PR 92761, 92762)

Message ID 510e890a-36dd-3886-ef19-5c9ce1f6607e@gmail.com
State New
Headers show
Series avoid invoking assignment on uninitialized objects (PR 92761, 92762) | expand

Commit Message

Martin Sebor Dec. 3, 2019, 10:41 p.m. UTC
After allocating a new chunk of memory hash_table::expand() copy-
assigns elements from the current array over the newly allocated
elements without constructing them.

Similarly, after destroying existing elements, hash_table::
empty_slow() assigns a new value to them.  This bug was
introduced in r249234 when trying to deal with -Wclass-memaccess
instances when the warning was first added.

Neither of these is a problem for PODs but both cause trouble when
the hash_table contains elements of a type with a user-defined copy
assignment operator.  There are at least two such instances in GCC
already and a third is under review.

The attached patch avoids this by using placement new to construct
new elements in uninitialized storage and restoring the original
call to memset in hash_table::empty_slow(), analogously to what
was done in r272893 for PR90923,

Longer term, to make these templates safely and efficiently usable
with non-POD types with user-defined copy ctors and copy assignment
operators I think these classes will probably need to be enhanced
to make use of "assign" and "move" traits functions to efficiently
assign and move objects.

Martin

Comments

Jeff Law Dec. 7, 2019, 4:15 p.m. UTC | #1
On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote:
> PR middle-end/92761 - hash_table::expand invokes assignment on invalid objects
> PR middle-end/92762 - hash_table::empty_slow invokes assignment on invalid objects
> gcc/ChangeLog:
> 
> 	PR middle-end/92761
> 	PR middle-end/92762
> 	* hash-map-tests.c (test_map_of_type_with_ctor_and_dtor): Tighten
> 	up tests.
> 	* hash-table.h (hash_table::expand): Use placement new to copy
> 	construct objects in uninitialized storage.
> 	(hash_table::empty_slow): Avoid invoking copy assignment on
> 	uninitialized objects.
OK
jeff
David Malcolm Dec. 10, 2019, 10:07 p.m. UTC | #2
On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote:
> After allocating a new chunk of memory hash_table::expand() copy-
> assigns elements from the current array over the newly allocated
> elements without constructing them.
> 
> Similarly, after destroying existing elements, hash_table::
> empty_slow() assigns a new value to them.  This bug was
> introduced in r249234 when trying to deal with -Wclass-memaccess
> instances when the warning was first added.
> 
> Neither of these is a problem for PODs but both cause trouble when
> the hash_table contains elements of a type with a user-defined copy
> assignment operator.  There are at least two such instances in GCC
> already and a third is under review.
> 
> The attached patch avoids this by using placement new to construct
> new elements in uninitialized storage and restoring the original
> call to memset in hash_table::empty_slow(), analogously to what
> was done in r272893 for PR90923,
> 
> Longer term, to make these templates safely and efficiently usable
> with non-POD types with user-defined copy ctors and copy assignment
> operators I think these classes will probably need to be enhanced
> to make use of "assign" and "move" traits functions to efficiently
> assign and move objects.
> 
> Martin

> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index ba5d64fb7a3..26bac624a08 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -818,8 +818,7 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
>        if (!is_empty (x) && !is_deleted (x))
>          {
>            value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
> -
> -          *q = x;
> +	  new ((void*) q) value_type (x);
>          }
>  
>        p++;
> @@ -869,14 +868,8 @@ hash_table<Descriptor, Lazy, Allocator>::empty_slow ()
>        m_size_prime_index = nindex;
>      }
>    else
> -    {
> -#ifndef BROKEN_VALUE_INITIALIZATION
> -      for ( ; size; ++entries, --size)
> -	*entries = value_type ();
> -#else
> -      memset (entries, 0, size * sizeof (value_type));
> -#endif
> -    }
> +    memset ((void *) entries, 0, size * sizeof (value_type));
> +
>    m_n_deleted = 0;
>    m_n_elements = 0;
>  }

On attempting to rebase my analyzer branch I found that this patch
uncovered a bug in it, but also possibly a bug in hash-table.h.

In the analyzer branch I have a hash_map with a key where the "empty"
value isn't all-zero-bits.

Specifically, in gcc/analyzer/program-state.h: sm_state_map has a
hash_map <svalue_id, entry_t> map_t, where svalue_id, the key type, has
hash traits:

template <>
inline void
pod_hash_traits<svalue_id>::mark_empty (value_type &v)
{
  v = svalue_id::null ();
}

which is a -1 int (all ones).

memset to zero bits populates the "empty" slots with a key with a non-
empty value for this key type, effectively corrupting the data
structure (luckily a selftest caught this).

Looking at the above patch, it looks like I was unwittingly relying on
two things:
(a) #ifndef BROKEN_VALUE_INITIALIZATION, and
(b) that the default ctor for my key type was the "empty" value.

However, shouldn't empty_slow be calling the Descriptor::mark_empty
function?  Doesn't this memset code assume that the "empty" value of
the hash_table entry is all-zeroes (and thus imposing the same
assumption on all hash_maps' key and value types?) - which isn't the
case for my code.  I don't remember seeing that assumption documented.

Or am I misreading this?

Thanks
Dave
Martin Sebor Dec. 10, 2019, 11 p.m. UTC | #3
On 12/10/19 3:07 PM, David Malcolm wrote:
> On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote:
>> After allocating a new chunk of memory hash_table::expand() copy-
>> assigns elements from the current array over the newly allocated
>> elements without constructing them.
>>
>> Similarly, after destroying existing elements, hash_table::
>> empty_slow() assigns a new value to them.  This bug was
>> introduced in r249234 when trying to deal with -Wclass-memaccess
>> instances when the warning was first added.
>>
>> Neither of these is a problem for PODs but both cause trouble when
>> the hash_table contains elements of a type with a user-defined copy
>> assignment operator.  There are at least two such instances in GCC
>> already and a third is under review.
>>
>> The attached patch avoids this by using placement new to construct
>> new elements in uninitialized storage and restoring the original
>> call to memset in hash_table::empty_slow(), analogously to what
>> was done in r272893 for PR90923,
>>
>> Longer term, to make these templates safely and efficiently usable
>> with non-POD types with user-defined copy ctors and copy assignment
>> operators I think these classes will probably need to be enhanced
>> to make use of "assign" and "move" traits functions to efficiently
>> assign and move objects.
>>
>> Martin
> 
>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>> index ba5d64fb7a3..26bac624a08 100644
>> --- a/gcc/hash-table.h
>> +++ b/gcc/hash-table.h
>> @@ -818,8 +818,7 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
>>         if (!is_empty (x) && !is_deleted (x))
>>           {
>>             value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
>> -
>> -          *q = x;
>> +	  new ((void*) q) value_type (x);
>>           }
>>   
>>         p++;
>> @@ -869,14 +868,8 @@ hash_table<Descriptor, Lazy, Allocator>::empty_slow ()
>>         m_size_prime_index = nindex;
>>       }
>>     else
>> -    {
>> -#ifndef BROKEN_VALUE_INITIALIZATION
>> -      for ( ; size; ++entries, --size)
>> -	*entries = value_type ();
>> -#else
>> -      memset (entries, 0, size * sizeof (value_type));
>> -#endif
>> -    }
>> +    memset ((void *) entries, 0, size * sizeof (value_type));
>> +
>>     m_n_deleted = 0;
>>     m_n_elements = 0;
>>   }
> 
> On attempting to rebase my analyzer branch I found that this patch
> uncovered a bug in it, but also possibly a bug in hash-table.h.
> 
> In the analyzer branch I have a hash_map with a key where the "empty"
> value isn't all-zero-bits.

There's a test case for it in comment #3 in 92762.

> 
> Specifically, in gcc/analyzer/program-state.h: sm_state_map has a
> hash_map <svalue_id, entry_t> map_t, where svalue_id, the key type, has
> hash traits:
> 
> template <>
> inline void
> pod_hash_traits<svalue_id>::mark_empty (value_type &v)
> {
>    v = svalue_id::null ();
> }
> 
> which is a -1 int (all ones).
> 
> memset to zero bits populates the "empty" slots with a key with a non-
> empty value for this key type, effectively corrupting the data
> structure (luckily a selftest caught this).
> 
> Looking at the above patch, it looks like I was unwittingly relying on
> two things:
> (a) #ifndef BROKEN_VALUE_INITIALIZATION, and
> (b) that the default ctor for my key type was the "empty" value.
> 
> However, shouldn't empty_slow be calling the Descriptor::mark_empty
> function?  Doesn't this memset code assume that the "empty" value of
> the hash_table entry is all-zeroes (and thus imposing the same
> assumption on all hash_maps' key and value types?) - which isn't the
> case for my code.  I don't remember seeing that assumption documented.
> 
> Or am I misreading this?

IIRC, I had tried the below but it caused problems during bootstrap
that I didn't spend enough time investigating.

   for (size_t i = size - 1; i < size; i--)
     if (!is_empty (entries[i]) && !is_deleted (entries[i]))
       {
         Descriptor::remove (entries[i]);
         mark_empty (entries[i]);
       }

(I think it also caused compilation error in one of the IPA passes
because of a missing mark_empty function in its traits class; maybe
ipa_bit_ggc_hash_traits in ipa-prop.c).

To be honest, I'm not sure I understand why the memset is even there.
It seems like a leak to me (I can reproduce leaks with a user-defined
type).  I was going to get back to it at some point.

Martin
diff mbox series

Patch

PR middle-end/92761 - hash_table::expand invokes assignment on invalid objects
PR middle-end/92762 - hash_table::empty_slow invokes assignment on invalid objects
gcc/ChangeLog:

	PR middle-end/92761
	PR middle-end/92762
	* hash-map-tests.c (test_map_of_type_with_ctor_and_dtor): Tighten
	up tests.
	* hash-table.h (hash_table::expand): Use placement new to copy
	construct objects in uninitialized storage.
	(hash_table::empty_slow): Avoid invoking copy assignment on
	uninitialized objects.

diff --git a/gcc/hash-map-tests.c b/gcc/hash-map-tests.c
index f3b216be07c..a42eac21ab3 100644
--- a/gcc/hash-map-tests.c
+++ b/gcc/hash-map-tests.c
@@ -117,23 +117,26 @@  public:
     ++ndefault;
   }
 
-  hash_map_test_val_t (const hash_map_test_val_t &)
+  hash_map_test_val_t (const hash_map_test_val_t &rhs)
     : ptr (&ptr)
   {
     ++ncopy;
+    gcc_assert (rhs.ptr == &rhs.ptr);
   }
 
-  hash_map_test_val_t& operator= (const hash_map_test_val_t &)
-    {
-     ++nassign;
-     return *this;
-    }
+  hash_map_test_val_t& operator= (const hash_map_test_val_t &rhs)
+  {
+    ++nassign;
+    gcc_assert (ptr == &ptr);
+    gcc_assert (rhs.ptr == &rhs.ptr);
+    return *this;
+  }
 
   ~hash_map_test_val_t ()
-    {
-     gcc_assert (ptr == &ptr);
-     ++ndtor;
-    }
+  {
+    gcc_assert (ptr == &ptr);
+    ++ndtor;
+  }
 
   void *ptr;
 } val_t;
@@ -184,7 +187,6 @@  test_map_of_type_with_ctor_and_dtor ()
     ASSERT_TRUE (nassign == val_t::nassign);
 
     ASSERT_TRUE (&rv1 != pv2);
-    ASSERT_TRUE (pv2->ptr == &pv2->ptr);
   }
 
   ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor);
@@ -207,7 +209,6 @@  test_map_of_type_with_ctor_and_dtor ()
     ASSERT_TRUE (nassign + 1 == val_t::nassign);
 
     ASSERT_TRUE (&rv1 != pv2);
-    ASSERT_TRUE (pv2->ptr == &pv2->ptr);
   }
 
   ASSERT_TRUE (val_t::ndefault + val_t::ncopy == val_t::ndtor);
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index ba5d64fb7a3..26bac624a08 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -818,8 +818,7 @@  hash_table<Descriptor, Lazy, Allocator>::expand ()
       if (!is_empty (x) && !is_deleted (x))
         {
           value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
-
-          *q = x;
+	  new ((void*) q) value_type (x);
         }
 
       p++;
@@ -869,14 +868,8 @@  hash_table<Descriptor, Lazy, Allocator>::empty_slow ()
       m_size_prime_index = nindex;
     }
   else
-    {
-#ifndef BROKEN_VALUE_INITIALIZATION
-      for ( ; size; ++entries, --size)
-	*entries = value_type ();
-#else
-      memset (entries, 0, size * sizeof (value_type));
-#endif
-    }
+    memset ((void *) entries, 0, size * sizeof (value_type));
+
   m_n_deleted = 0;
   m_n_elements = 0;
 }