diff mbox series

Fix bootstrap with GCC 3.4 (and likely anything < 4.4 except 4.2.[0-3]) (PR bootstrap/84405)

Message ID 20180216083843.GN5867@tucnak
State New
Headers show
Series Fix bootstrap with GCC 3.4 (and likely anything < 4.4 except 4.2.[0-3]) (PR bootstrap/84405) | expand

Commit Message

Jakub Jelinek Feb. 16, 2018, 8:38 a.m. UTC
Hi!

As the system.h comment says, it took us years to get value initialization
right, and unfortunately we've started hitting it in GCC codebase now.
There are various related PRs, 4.2.[0-3] are most broken in this regard,
value initialization of non-PODs doesn't actually initialize there anything,
but https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84405#c5 shows a testcase
where any GCC < 4.3 just calls the default ctor for the
*entries = value_type ();
in hash-table.h even when value_type has no user provided ctor, and one
pointer field and another field with user provided ctor.  We want to
zero initialize the pointer and default initialize the other field, but
< 4.3 just default initializes the other field and pointer (m_key) remains
uninitialized.
The vec_default_construct case can hit another bug in < 4.4, if we have a
similar class with multiple fields and at least one of them with user provided
ctor, and at least one of them without user provided ctor, the new T ()
default initializes it rather than value initializes it.

The following patch provides sufficient workaround so that GCC 3.4.6 as a
system compiler can bootstrap GCC (haven't tested other versions).  It is
not enough to get GCC 4.2.[0-3] usable as bootstrap compiler, either we add
2-3 further workarounds, or just reject GCC >= 4.2.0 and <= 4.2.3 in
configure.

Ok for trunk?

2018-02-16  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/84405
	* system.h (BROKEN_VALUE_INITIALIZATION): Define for GCC < 4.3.
	* vec.h (vec_default_construct): Use memset instead of placement new
	if BROKEN_VALUE_INITIALIZATION is defined.
	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow): Use
	memset instead of value initialization if BROKEN_VALUE_INITIALIZATION
	is defined.


	Jakub

Comments

Jakub Jelinek Feb. 16, 2018, 9 a.m. UTC | #1
On Fri, Feb 16, 2018 at 09:48:25AM +0100, Richard Biener wrote:
> Ok.  I'll see if fixing libcpp makes GCC 4.2.[0-3] usable as well.

Looking through r249234, dump_file_info is ok, because it is a POD,
ipcp_value/ipa_edge_args too, because it has a user provided ctor that
initializes everything, wide_int has a user provided ctor that does nothing
(so the memcpy -> placement new change actually removed the zero clearing
is just needed for pedantic reasons), the line-maps.h change we know doesn't
work with 4.2.[0-3] and that is pretty much it (other changes don't involve
value initialization).

	Jakub
Martin Sebor Feb. 16, 2018, 5:36 p.m. UTC | #2
On 02/16/2018 01:38 AM, Jakub Jelinek wrote:
> Hi!
>
> As the system.h comment says, it took us years to get value initialization
> right, and unfortunately we've started hitting it in GCC codebase now.
> There are various related PRs, 4.2.[0-3] are most broken in this regard,
> value initialization of non-PODs doesn't actually initialize there anything,
> but https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84405#c5 shows a testcase
> where any GCC < 4.3 just calls the default ctor for the
> *entries = value_type ();
> in hash-table.h even when value_type has no user provided ctor, and one
> pointer field and another field with user provided ctor.  We want to
> zero initialize the pointer and default initialize the other field, but
> < 4.3 just default initializes the other field and pointer (m_key) remains
> uninitialized.
> The vec_default_construct case can hit another bug in < 4.4, if we have a
> similar class with multiple fields and at least one of them with user provided
> ctor, and at least one of them without user provided ctor, the new T ()
> default initializes it rather than value initializes it.
>
> The following patch provides sufficient workaround so that GCC 3.4.6 as a
> system compiler can bootstrap GCC (haven't tested other versions).  It is
> not enough to get GCC 4.2.[0-3] usable as bootstrap compiler, either we add
> 2-3 further workarounds, or just reject GCC >= 4.2.0 and <= 4.2.3 in
> configure.

The deeper problem is that vec wants to be a POD but GCC
instantiates the template on non-POD types.  For example
genrecog.c instantiates it on struct parameter that has
a default ctor.

A number of ipa-*.c files instantiate vec on 
ipa_polymorphic_call_context, which also has a default ctor,
and ctor even has different semantics than memset (it sets
a few members to non-zero).  So calling memset here instead
of the default ctor could be a potential bug.

Martin

>
> Ok for trunk?
>
> 2018-02-16  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR bootstrap/84405
> 	* system.h (BROKEN_VALUE_INITIALIZATION): Define for GCC < 4.3.
> 	* vec.h (vec_default_construct): Use memset instead of placement new
> 	if BROKEN_VALUE_INITIALIZATION is defined.
> 	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow): Use
> 	memset instead of value initialization if BROKEN_VALUE_INITIALIZATION
> 	is defined.
>
> --- gcc/system.h.jj	2018-01-03 10:19:54.530533857 +0100
> +++ gcc/system.h	2018-02-15 21:56:37.168139415 +0100
> @@ -824,6 +824,12 @@ extern void fancy_abort (const char *, i
>  /* Some compilers do not allow the use of unsigned char in bitfields.  */
>  #define BOOL_BITFIELD unsigned int
>
> +/* GCC older than 4.4 have broken C++ value initialization handling, see
> +   PR11309, PR30111, PR33916, PR82939 and PR84405 for more details.  */
> +#if GCC_VERSION > 0 && GCC_VERSION < 4004 && !defined(__clang__)
> +# define BROKEN_VALUE_INITIALIZATION
> +#endif
> +
>  /* As the last action in this file, we poison the identifiers that
>     shouldn't be used.  Note, luckily gcc-3.0's token-based integrated
>     preprocessor won't trip on poisoned identifiers that arrive from
> --- gcc/vec.h.jj	2018-01-03 10:19:54.606533869 +0100
> +++ gcc/vec.h	2018-02-15 22:00:36.386216876 +0100
> @@ -490,8 +490,12 @@ template <typename T>
>  inline void
>  vec_default_construct (T *dst, unsigned n)
>  {
> +#ifndef BROKEN_VALUE_INITIALIZATION
>    for ( ; n; ++dst, --n)
>      ::new (static_cast<void*>(dst)) T ();
> +#else
> +  memset (dst, '\0', sizeof (T) * n);
> +#endif
>  }
>
>  /* Copy-construct N elements in DST from *SRC.  */
> --- gcc/hash-table.h.jj	2018-01-03 10:19:54.899533916 +0100
> +++ gcc/hash-table.h	2018-02-15 21:58:14.449170915 +0100
> @@ -804,8 +804,12 @@ hash_table<Descriptor, Allocator>::empty
>      }
>    else
>      {
> +#ifndef BROKEN_VALUE_INITIALIZATION
>        for ( ; size; ++entries, --size)
>  	*entries = value_type ();
> +#else
> +      memset (entries, 0, size * sizeof (value_type));
> +#endif
>      }
>    m_n_deleted = 0;
>    m_n_elements = 0;
>
> 	Jakub
>
Jakub Jelinek Feb. 16, 2018, 6:05 p.m. UTC | #3
On Fri, Feb 16, 2018 at 10:36:27AM -0700, Martin Sebor wrote:
> The deeper problem is that vec wants to be a POD but GCC
> instantiates the template on non-POD types.  For example
> genrecog.c instantiates it on struct parameter that has
> a default ctor.
> 
> A number of ipa-*.c files instantiate vec on ipa_polymorphic_call_context,
> which also has a default ctor,
> and ctor even has different semantics than memset (it sets
> a few members to non-zero).  So calling memset here instead
> of the default ctor could be a potential bug.

Then we should just call memset and then do the placement new for older
compilers (those certainly didn't have -flifetime-dse=2, so wouldn't throw
away whatever has been there before the ctor), or just fix up
ipa_polymorphic_call_context so that default ctor clears everything rather
than setting something to true.

In any case, I hope we don't grow too many new uses of value initialization
which is so problematic in older compilers.

Here is what I'll test.  AFAIK both the *something = value_init ();
and ::new (ptr) value_init (); work even in the buggy compilers, including
4.2.[0-3], if value_init has user defined default ctor, just call
that default ctor and do nothing else, the mishandled cases are only for
classes without those.  So it will be enough to make sure we don't create
hash maps of types that need something non-zero in default ctor or put such
types into classes without user provided default ctor.

2018-02-16  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/84405
	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
	memset and value initialization afterwards.
	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow):
	Likewise.

--- gcc/hash-table.h.jj	2018-02-16 10:04:25.179740963 +0100
+++ gcc/hash-table.h	2018-02-16 18:48:38.359845266 +0100
@@ -804,12 +804,11 @@ hash_table<Descriptor, Allocator>::empty
     }
   else
     {
-#ifndef BROKEN_VALUE_INITIALIZATION
-      for ( ; size; ++entries, --size)
-	*entries = value_type ();
-#else
+#ifdef BROKEN_VALUE_INITIALIZATION
       memset (entries, 0, size * sizeof (value_type));
 #endif
+      for ( ; size; ++entries, --size)
+	*entries = value_type ();
     }
   m_n_deleted = 0;
   m_n_elements = 0;
--- gcc/vec.h.jj	2018-02-16 10:04:25.178740963 +0100
+++ gcc/vec.h	2018-02-16 18:49:21.305850964 +0100
@@ -490,12 +490,11 @@ template <typename T>
 inline void
 vec_default_construct (T *dst, unsigned n)
 {
-#ifndef BROKEN_VALUE_INITIALIZATION
-  for ( ; n; ++dst, --n)
-    ::new (static_cast<void*>(dst)) T ();
-#else
+#ifdef BROKEN_VALUE_INITIALIZATION
   memset (dst, '\0', sizeof (T) * n);
 #endif
+  for ( ; n; ++dst, --n)
+    ::new (static_cast<void*>(dst)) T ();
 }
 
 /* Copy-construct N elements in DST from *SRC.  */


	Jakub
Martin Sebor Feb. 16, 2018, 7:47 p.m. UTC | #4
On 02/16/2018 11:05 AM, Jakub Jelinek wrote:
> On Fri, Feb 16, 2018 at 10:36:27AM -0700, Martin Sebor wrote:
>> The deeper problem is that vec wants to be a POD but GCC
>> instantiates the template on non-POD types.  For example
>> genrecog.c instantiates it on struct parameter that has
>> a default ctor.
>>
>> A number of ipa-*.c files instantiate vec on ipa_polymorphic_call_context,
>> which also has a default ctor,
>> and ctor even has different semantics than memset (it sets
>> a few members to non-zero).  So calling memset here instead
>> of the default ctor could be a potential bug.
>
> Then we should just call memset and then do the placement new for older
> compilers (those certainly didn't have -flifetime-dse=2, so wouldn't throw
> away whatever has been there before the ctor), or just fix up
> ipa_polymorphic_call_context so that default ctor clears everything rather
> than setting something to true.
>
> In any case, I hope we don't grow too many new uses of value initialization
> which is so problematic in older compilers.
>
> Here is what I'll test.  AFAIK both the *something = value_init ();
> and ::new (ptr) value_init (); work even in the buggy compilers, including
> 4.2.[0-3], if value_init has user defined default ctor, just call
> that default ctor and do nothing else, the mishandled cases are only for
> classes without those.  So it will be enough to make sure we don't create
> hash maps of types that need something non-zero in default ctor or put such
> types into classes without user provided default ctor.

As a workaround for the GCC 4.2 bug it makes sense to me.

There was a proposal not so long ago to adopt C++ 11 that, if
I recall, had broad support.  That would make the minimum GCC
version 4.8 and the question of relying on value initialization
(hopefully) moot.  It should also make it easier to make use
of classes like vec with near-trivial types with ctors.  I'm
hoping it can be revisited in stage 1.  Ultimately, we need
to decide if vec really needs to be a POD (or trivial or
whatever), and if so, avoid instantiating it on non-PODs
(or non-trivial whatevers) and add a static assertion to
its interface to enforce it.

Martin

>
> 2018-02-16  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR bootstrap/84405
> 	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
> 	memset and value initialization afterwards.
> 	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow):
> 	Likewise.
>
> --- gcc/hash-table.h.jj	2018-02-16 10:04:25.179740963 +0100
> +++ gcc/hash-table.h	2018-02-16 18:48:38.359845266 +0100
> @@ -804,12 +804,11 @@ hash_table<Descriptor, Allocator>::empty
>      }
>    else
>      {
> -#ifndef BROKEN_VALUE_INITIALIZATION
> -      for ( ; size; ++entries, --size)
> -	*entries = value_type ();
> -#else
> +#ifdef BROKEN_VALUE_INITIALIZATION
>        memset (entries, 0, size * sizeof (value_type));
>  #endif
> +      for ( ; size; ++entries, --size)
> +	*entries = value_type ();
>      }
>    m_n_deleted = 0;
>    m_n_elements = 0;
> --- gcc/vec.h.jj	2018-02-16 10:04:25.178740963 +0100
> +++ gcc/vec.h	2018-02-16 18:49:21.305850964 +0100
> @@ -490,12 +490,11 @@ template <typename T>
>  inline void
>  vec_default_construct (T *dst, unsigned n)
>  {
> -#ifndef BROKEN_VALUE_INITIALIZATION
> -  for ( ; n; ++dst, --n)
> -    ::new (static_cast<void*>(dst)) T ();
> -#else
> +#ifdef BROKEN_VALUE_INITIALIZATION
>    memset (dst, '\0', sizeof (T) * n);
>  #endif
> +  for ( ; n; ++dst, --n)
> +    ::new (static_cast<void*>(dst)) T ();
>  }
>
>  /* Copy-construct N elements in DST from *SRC.  */
>
>
> 	Jakub
>
Jakub Jelinek Feb. 16, 2018, 7:52 p.m. UTC | #5
On Fri, Feb 16, 2018 at 12:47:31PM -0700, Martin Sebor wrote:
> hoping it can be revisited in stage 1.  Ultimately, we need
> to decide if vec really needs to be a POD (or trivial or

Yes, definitely.  This has been discussed several times.

	Jakub
Richard Biener Feb. 16, 2018, 8:25 p.m. UTC | #6
On February 16, 2018 7:05:10 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>On Fri, Feb 16, 2018 at 10:36:27AM -0700, Martin Sebor wrote:
>> The deeper problem is that vec wants to be a POD but GCC
>> instantiates the template on non-POD types.  For example
>> genrecog.c instantiates it on struct parameter that has
>> a default ctor.
>> 
>> A number of ipa-*.c files instantiate vec on
>ipa_polymorphic_call_context,
>> which also has a default ctor,
>> and ctor even has different semantics than memset (it sets
>> a few members to non-zero).  So calling memset here instead
>> of the default ctor could be a potential bug.
>
>Then we should just call memset and then do the placement new for older
>compilers (those certainly didn't have -flifetime-dse=2, so wouldn't
>throw
>away whatever has been there before the ctor), or just fix up
>ipa_polymorphic_call_context so that default ctor clears everything
>rather
>than setting something to true.
>
>In any case, I hope we don't grow too many new uses of value
>initialization
>which is so problematic in older compilers.
>
>Here is what I'll test.  AFAIK both the *something = value_init ();
>and ::new (ptr) value_init (); work even in the buggy compilers,
>including
>4.2.[0-3], if value_init has user defined default ctor, just call
>that default ctor and do nothing else, the mishandled cases are only
>for
>classes without those.  So it will be enough to make sure we don't
>create
>hash maps of types that need something non-zero in default ctor or put
>such
>types into classes without user provided default ctor.

But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR. 

>
>2018-02-16  Jakub Jelinek  <jakub@redhat.com>
>
>	PR bootstrap/84405
>	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
>	memset and value initialization afterwards.
>	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow):
>	Likewise.
>
>--- gcc/hash-table.h.jj	2018-02-16 10:04:25.179740963 +0100
>+++ gcc/hash-table.h	2018-02-16 18:48:38.359845266 +0100
>@@ -804,12 +804,11 @@ hash_table<Descriptor, Allocator>::empty
>     }
>   else
>     {
>-#ifndef BROKEN_VALUE_INITIALIZATION
>-      for ( ; size; ++entries, --size)
>-	*entries = value_type ();
>-#else
>+#ifdef BROKEN_VALUE_INITIALIZATION
>       memset (entries, 0, size * sizeof (value_type));
> #endif
>+      for ( ; size; ++entries, --size)
>+	*entries = value_type ();
>     }
>   m_n_deleted = 0;
>   m_n_elements = 0;
>--- gcc/vec.h.jj	2018-02-16 10:04:25.178740963 +0100
>+++ gcc/vec.h	2018-02-16 18:49:21.305850964 +0100
>@@ -490,12 +490,11 @@ template <typename T>
> inline void
> vec_default_construct (T *dst, unsigned n)
> {
>-#ifndef BROKEN_VALUE_INITIALIZATION
>-  for ( ; n; ++dst, --n)
>-    ::new (static_cast<void*>(dst)) T ();
>-#else
>+#ifdef BROKEN_VALUE_INITIALIZATION
>   memset (dst, '\0', sizeof (T) * n);
> #endif
>+  for ( ; n; ++dst, --n)
>+    ::new (static_cast<void*>(dst)) T ();
> }
> 
> /* Copy-construct N elements in DST from *SRC.  */
>
>
>	Jakub
Richard Biener Feb. 16, 2018, 8:28 p.m. UTC | #7
On February 16, 2018 8:52:18 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>On Fri, Feb 16, 2018 at 12:47:31PM -0700, Martin Sebor wrote:
>> hoping it can be revisited in stage 1.  Ultimately, we need
>> to decide if vec really needs to be a POD (or trivial or
>
>Yes, definitely.  This has been discussed several times.

IIRC this original requirement came from vec embedded into GCed data structures and the GC machinery not dealing with that correctly. Not sure if that was fully resolved or if it was solved efficiently enough to make widespread use of it. 

Richard. 

>	Jakub
Jakub Jelinek Feb. 16, 2018, 8:35 p.m. UTC | #8
On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
> But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR. 

For the vec.h case no, there is no temporary involved, it is constructed
directly into the memory.

For the hash-table.h you're right, so we shouldn't change anything,
perhaps we can add something based on C++11 type_traits conditionally
to determine what ctors to look at (std::is_pod false,
std::is_trivially_constructible true?)?

	Jakub
Jakub Jelinek Feb. 16, 2018, 8:44 p.m. UTC | #9
On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
> But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR. 

Perhaps for hash-table.h we could use:
#ifndef BROKEN_VALUE_INITIALIZATION
      for ( ; size; ++entries, --size)
	*entries = value_type ();
#else
      union U { char c[sizeof (value_type)]; value_type v; } u;
      memset (u.c, '\0', sizeof (u.c));
      value_type *p = ::new (static_cast<void*>(u.c)) value_type ();
      for ( ; size; ++entries, --size)
	*entries = *p;
      p->~value_type ();
#endif
or so, if it is valid C++, which could turn the hash-table.h case into
the vec.h case.

	Jakub
Martin Sebor Feb. 16, 2018, 9:14 p.m. UTC | #10
On 02/16/2018 01:44 PM, Jakub Jelinek wrote:
> On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
>> But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR.
>
> Perhaps for hash-table.h we could use:
> #ifndef BROKEN_VALUE_INITIALIZATION
>       for ( ; size; ++entries, --size)
> 	*entries = value_type ();
> #else
>       union U { char c[sizeof (value_type)]; value_type v; } u;

In C++ 98 a type with a constructor cannot be a member of a union
so the above wouldn't be valid but if I understand what you're
trying to do you don't need the member.  All you need is a buffer.
Something like this?

   char __attribute__ ((aligned (__alignof__ (value_type))))
   buf [sizeof (value_type)] = "";
   value_type *p = new (buf) value_type ();
   *entries = *p;
   p->value_type ();

I have no idea if this gets around the bug.

Martin

>       memset (u.c, '\0', sizeof (u.c));
>       value_type *p = ::new (static_cast<void*>(u.c)) value_type ();
>       for ( ; size; ++entries, --size)
> 	*entries = *p;
>       p->~value_type ();
> #endif
> or so, if it is valid C++, which could turn the hash-table.h case into
> the vec.h case.
>
> 	Jakub
>
Jakub Jelinek Feb. 16, 2018, 9:31 p.m. UTC | #11
On Fri, Feb 16, 2018 at 02:14:09PM -0700, Martin Sebor wrote:
> On 02/16/2018 01:44 PM, Jakub Jelinek wrote:
> > On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
> > > But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR.
> > 
> > Perhaps for hash-table.h we could use:
> > #ifndef BROKEN_VALUE_INITIALIZATION
> >       for ( ; size; ++entries, --size)
> > 	*entries = value_type ();
> > #else
> >       union U { char c[sizeof (value_type)]; value_type v; } u;
> 
> In C++ 98 a type with a constructor cannot be a member of a union
> so the above wouldn't be valid but if I understand what you're
> trying to do you don't need the member.  All you need is a buffer.
> Something like this?
> 
>   char __attribute__ ((aligned (__alignof__ (value_type))))
>   buf [sizeof (value_type)] = "";
>   value_type *p = new (buf) value_type ();

Rather
    for ( ; size; ++entries, --size)
      *entries = *p;
    p->~value_type ();

I was trying to avoid the __aligned__ attribute and __alignof__, but
I guess we could guard that with GCC_VERSION >= 2000 or something similar,
after all, the documented minimum is GCC 3.4 I think.

>   *entries = *p;
>   p->value_type ();
> 
> I have no idea if this gets around the bug.

	Jakub
Jakub Jelinek Feb. 17, 2018, 9:08 a.m. UTC | #12
On Fri, Feb 16, 2018 at 02:14:09PM -0700, Martin Sebor wrote:
> On 02/16/2018 01:44 PM, Jakub Jelinek wrote:
> > On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
> > > But the broken compilers will overwrite the memset data with possibly uninitialized fields of a TARGET_EXPR.
> > 
> > Perhaps for hash-table.h we could use:
> > #ifndef BROKEN_VALUE_INITIALIZATION
> >       for ( ; size; ++entries, --size)
> > 	*entries = value_type ();
> > #else
> >       union U { char c[sizeof (value_type)]; value_type v; } u;
> 
> In C++ 98 a type with a constructor cannot be a member of a union
> so the above wouldn't be valid but if I understand what you're
> trying to do you don't need the member.  All you need is a buffer.
> Something like this?
> 
>   char __attribute__ ((aligned (__alignof__ (value_type))))
>   buf [sizeof (value_type)] = "";

Note, this doesn't work, as G++ <= 4.2 rejects:
template <typename value_type>
struct S
{
  char __attribute__ ((__aligned__ (__alignof__ (value_type)))) buf[sizeof (value_type)];
};

S<int> s;

with:
error: requested alignment is not a constant

The following works though, and has been successfully bootstrapped/regtested
on x86_64-linux and i686-linux.  It will fail if we use hash_table on
overaligned value_type type with alignment > 64, but that is not completely
unlikely on any target I'm aware of.  It uses __alignof__ which should be
supported by the system GCC's we support for which
BROKEN_VALUE_INITIALIZATION is true (i.e. >= 3.4 and <= 4.3), non-GCC
compilers shouldn't get here due to GCC_VERSION != 0 unless they pretend to
be GCC compatible.

Bootstrapped with system gcc34/g++34 on x86_64-linux, ok for trunk?

2018-02-17  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/84405
	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
	memset and value initialization afterwards.
	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow): For
	BROKEN_VALUE_INITIALIZATION use placement new into a previously
	cleared appropriately aligned buffer.

--- gcc/vec.h.jj	2018-02-16 19:39:03.637058918 +0100
+++ gcc/vec.h	2018-02-16 23:51:55.389582056 +0100
@@ -490,12 +490,11 @@ template <typename T>
 inline void
 vec_default_construct (T *dst, unsigned n)
 {
-#ifndef BROKEN_VALUE_INITIALIZATION
-  for ( ; n; ++dst, --n)
-    ::new (static_cast<void*>(dst)) T ();
-#else
+#ifdef BROKEN_VALUE_INITIALIZATION
   memset (dst, '\0', sizeof (T) * n);
 #endif
+  for ( ; n; ++dst, --n)
+    ::new (static_cast<void*>(dst)) T ();
 }
 
 /* Copy-construct N elements in DST from *SRC.  */
--- gcc/hash-table.h.jj	2018-02-16 19:39:03.660058934 +0100
+++ gcc/hash-table.h	2018-02-17 00:01:24.207723925 +0100
@@ -808,7 +808,18 @@ hash_table<Descriptor, Allocator>::empty
       for ( ; size; ++entries, --size)
 	*entries = value_type ();
 #else
-      memset (entries, 0, size * sizeof (value_type));
+      /* To workaround older GCC versions which don't handle value
+	 initialization right, use a placement new into previously
+	 cleared buffer.  */
+      char buf[2 * sizeof (value_type) + 64];
+      gcc_assert (__alignof__ (value_type) <= sizeof (value_type) + 64);
+      char *q = (buf + sizeof (value_type) + 64
+		 - ((uintptr_t) buf % __alignof__ (value_type)));
+      memset (q, '\0', sizeof (value_type));
+      value_type *p = ::new (q) value_type ();
+      for ( ; size; ++entries, --size)
+	*entries = *p;
+      p->~value_type ();
 #endif
     }
   m_n_deleted = 0;


	Jakub
Richard Biener Feb. 17, 2018, 6:58 p.m. UTC | #13
On February 16, 2018 9:44:33 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote:
>On Fri, Feb 16, 2018 at 09:25:30PM +0100, Richard Biener wrote:
>> But the broken compilers will overwrite the memset data with possibly
>uninitialized fields of a TARGET_EXPR. 
>
>Perhaps for hash-table.h we could use:
>#ifndef BROKEN_VALUE_INITIALIZATION
>      for ( ; size; ++entries, --size)
>	*entries = value_type ();
>#else
>      union U { char c[sizeof (value_type)]; value_type v; } u;
>      memset (u.c, '\0', sizeof (u.c));
>      value_type *p = ::new (static_cast<void*>(u.c)) value_type ();
>      for ( ; size; ++entries, --size)
>	*entries = *p;
>      p->~value_type ();
>#endif
>or so, if it is valid C++, which could turn the hash-table.h case into
>the vec.h case.

I don't think a workaround for a non-conforming compiler has to be strictly conforming. And the more complicated we make this the bigger the chances we hit some other latent bug. 

Richard. 

>	Jakub
Jakub Jelinek Feb. 26, 2018, 12:10 p.m. UTC | #14
Hi!

On Sat, Feb 17, 2018 at 10:08:48AM +0100, Jakub Jelinek wrote:
> Note, this doesn't work, as G++ <= 4.2 rejects:
> template <typename value_type>
> struct S
> {
>   char __attribute__ ((__aligned__ (__alignof__ (value_type)))) buf[sizeof (value_type)];
> };
> 
> S<int> s;
> 
> with:
> error: requested alignment is not a constant
> 
> The following works though, and has been successfully bootstrapped/regtested
> on x86_64-linux and i686-linux.  It will fail if we use hash_table on
> overaligned value_type type with alignment > 64, but that is not completely
> unlikely on any target I'm aware of.  It uses __alignof__ which should be
> supported by the system GCC's we support for which
> BROKEN_VALUE_INITIALIZATION is true (i.e. >= 3.4 and <= 4.3), non-GCC
> compilers shouldn't get here due to GCC_VERSION != 0 unless they pretend to
> be GCC compatible.
> 
> Bootstrapped with system gcc34/g++34 on x86_64-linux, ok for trunk?
> 
> 2018-02-17  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR bootstrap/84405
> 	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
> 	memset and value initialization afterwards.
> 	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow): For
> 	BROKEN_VALUE_INITIALIZATION use placement new into a previously
> 	cleared appropriately aligned buffer.

I'd like to ping this patch, or if it isn't acceptable, at least the vec.h
part of it which isn't really more complex than what we have right now.

> --- gcc/vec.h.jj	2018-02-16 19:39:03.637058918 +0100
> +++ gcc/vec.h	2018-02-16 23:51:55.389582056 +0100
> @@ -490,12 +490,11 @@ template <typename T>
>  inline void
>  vec_default_construct (T *dst, unsigned n)
>  {
> -#ifndef BROKEN_VALUE_INITIALIZATION
> -  for ( ; n; ++dst, --n)
> -    ::new (static_cast<void*>(dst)) T ();
> -#else
> +#ifdef BROKEN_VALUE_INITIALIZATION
>    memset (dst, '\0', sizeof (T) * n);
>  #endif
> +  for ( ; n; ++dst, --n)
> +    ::new (static_cast<void*>(dst)) T ();
>  }
>  
>  /* Copy-construct N elements in DST from *SRC.  */
> --- gcc/hash-table.h.jj	2018-02-16 19:39:03.660058934 +0100
> +++ gcc/hash-table.h	2018-02-17 00:01:24.207723925 +0100
> @@ -808,7 +808,18 @@ hash_table<Descriptor, Allocator>::empty
>        for ( ; size; ++entries, --size)
>  	*entries = value_type ();
>  #else
> -      memset (entries, 0, size * sizeof (value_type));
> +      /* To workaround older GCC versions which don't handle value
> +	 initialization right, use a placement new into previously
> +	 cleared buffer.  */
> +      char buf[2 * sizeof (value_type) + 64];
> +      gcc_assert (__alignof__ (value_type) <= sizeof (value_type) + 64);
> +      char *q = (buf + sizeof (value_type) + 64
> +		 - ((uintptr_t) buf % __alignof__ (value_type)));
> +      memset (q, '\0', sizeof (value_type));
> +      value_type *p = ::new (q) value_type ();
> +      for ( ; size; ++entries, --size)
> +	*entries = *p;
> +      p->~value_type ();
>  #endif
>      }
>    m_n_deleted = 0;

	Jakub
Richard Biener Feb. 26, 2018, 12:15 p.m. UTC | #15
On Mon, 26 Feb 2018, Jakub Jelinek wrote:

> Hi!
> 
> On Sat, Feb 17, 2018 at 10:08:48AM +0100, Jakub Jelinek wrote:
> > Note, this doesn't work, as G++ <= 4.2 rejects:
> > template <typename value_type>
> > struct S
> > {
> >   char __attribute__ ((__aligned__ (__alignof__ (value_type)))) buf[sizeof (value_type)];
> > };
> > 
> > S<int> s;
> > 
> > with:
> > error: requested alignment is not a constant
> > 
> > The following works though, and has been successfully bootstrapped/regtested
> > on x86_64-linux and i686-linux.  It will fail if we use hash_table on
> > overaligned value_type type with alignment > 64, but that is not completely
> > unlikely on any target I'm aware of.  It uses __alignof__ which should be
> > supported by the system GCC's we support for which
> > BROKEN_VALUE_INITIALIZATION is true (i.e. >= 3.4 and <= 4.3), non-GCC
> > compilers shouldn't get here due to GCC_VERSION != 0 unless they pretend to
> > be GCC compatible.
> > 
> > Bootstrapped with system gcc34/g++34 on x86_64-linux, ok for trunk?
> > 
> > 2018-02-17  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR bootstrap/84405
> > 	* vec.h (vec_default_construct): For BROKEN_VALUE_INITIALIZATION use
> > 	memset and value initialization afterwards.
> > 	* hash-table.h (hash_table<Descriptor, Allocator>::empty_slow): For
> > 	BROKEN_VALUE_INITIALIZATION use placement new into a previously
> > 	cleared appropriately aligned buffer.
> 
> I'd like to ping this patch, or if it isn't acceptable, at least the vec.h
> part of it which isn't really more complex than what we have right now.

So BROKEN_VALUE_INITIALIZATION doesn't really mean it's "broken" but
that is has certain behavior that doesn't invalidate the clearing
done earlier?  That said, I wonder if first doing the regular
construction and memset afterwards in case of
BROKEN_VALUE_INITIALIZATION wouldn't be more reliable and "easier"?

> > --- gcc/vec.h.jj	2018-02-16 19:39:03.637058918 +0100
> > +++ gcc/vec.h	2018-02-16 23:51:55.389582056 +0100
> > @@ -490,12 +490,11 @@ template <typename T>
> >  inline void
> >  vec_default_construct (T *dst, unsigned n)
> >  {
> > -#ifndef BROKEN_VALUE_INITIALIZATION
> > -  for ( ; n; ++dst, --n)
> > -    ::new (static_cast<void*>(dst)) T ();
> > -#else
> > +#ifdef BROKEN_VALUE_INITIALIZATION
> >    memset (dst, '\0', sizeof (T) * n);
> >  #endif
> > +  for ( ; n; ++dst, --n)
> > +    ::new (static_cast<void*>(dst)) T ();
> >  }
> >  
> >  /* Copy-construct N elements in DST from *SRC.  */
> > --- gcc/hash-table.h.jj	2018-02-16 19:39:03.660058934 +0100
> > +++ gcc/hash-table.h	2018-02-17 00:01:24.207723925 +0100
> > @@ -808,7 +808,18 @@ hash_table<Descriptor, Allocator>::empty
> >        for ( ; size; ++entries, --size)
> >  	*entries = value_type ();
> >  #else
> > -      memset (entries, 0, size * sizeof (value_type));
> > +      /* To workaround older GCC versions which don't handle value
> > +	 initialization right, use a placement new into previously
> > +	 cleared buffer.  */
> > +      char buf[2 * sizeof (value_type) + 64];
> > +      gcc_assert (__alignof__ (value_type) <= sizeof (value_type) + 64);
> > +      char *q = (buf + sizeof (value_type) + 64
> > +		 - ((uintptr_t) buf % __alignof__ (value_type)));
> > +      memset (q, '\0', sizeof (value_type));
> > +      value_type *p = ::new (q) value_type ();
> > +      for ( ; size; ++entries, --size)
> > +	*entries = *p;
> > +      p->~value_type ();
> >  #endif
> >      }
> >    m_n_deleted = 0;
> 
> 	Jakub
> 
>
Jakub Jelinek Feb. 26, 2018, 12:27 p.m. UTC | #16
On Mon, Feb 26, 2018 at 01:15:45PM +0100, Richard Biener wrote:
> > I'd like to ping this patch, or if it isn't acceptable, at least the vec.h
> > part of it which isn't really more complex than what we have right now.
> 
> So BROKEN_VALUE_INITIALIZATION doesn't really mean it's "broken" but
> that is has certain behavior that doesn't invalidate the clearing
> done earlier?  That said, I wonder if first doing the regular
> construction and memset afterwards in case of
> BROKEN_VALUE_INITIALIZATION wouldn't be more reliable and "easier"?

BROKEN_VALUE_INITIALIZATION means the value initialization is broken,
where broken means that for certain cases (for 4.2.[0-3] more than others)
it doesn't initialize the object or some subobjects at all.
memset after wouldn't make any sense, that is equivalent to just memset.
What the patch is meant to fix are cases where even the broken value
initialization invokes a ctor (when T has user defined default ctor).

There are several cases of
vec<ipa_polymorphic_call_context> where ipa_polymorphic_call_context
has user defined default ctor that doesn't clear the object, but instead
sets a bunch of bool flags in it to true.  If we do just memset when
built with older system G++ we get different behavior.

Even with the patch if BROKEN_VALUE_INITIALIZATION we won't handle e.g.
cases where we'd have vec of a struct without user defined default ctor,
but where one of the fields has ipa_polymorphic_call_context type or
some other with non-trivial user defined default ctor that doesn't
clear everything.  We don't have anything like that in the codebase
right now, and if we would, a workaround would be to provide user defined
default ctor for such types if we want to put them into vec.

Unlike the vec.h case, the hash-table.h is just for hypothetical case
where we'd have a hash table with such types.

> > > --- gcc/vec.h.jj	2018-02-16 19:39:03.637058918 +0100
> > > +++ gcc/vec.h	2018-02-16 23:51:55.389582056 +0100
> > > @@ -490,12 +490,11 @@ template <typename T>
> > >  inline void
> > >  vec_default_construct (T *dst, unsigned n)
> > >  {
> > > -#ifndef BROKEN_VALUE_INITIALIZATION
> > > -  for ( ; n; ++dst, --n)
> > > -    ::new (static_cast<void*>(dst)) T ();
> > > -#else
> > > +#ifdef BROKEN_VALUE_INITIALIZATION
> > >    memset (dst, '\0', sizeof (T) * n);
> > >  #endif
> > > +  for ( ; n; ++dst, --n)
> > > +    ::new (static_cast<void*>(dst)) T ();
> > >  }
> > >  
> > >  /* Copy-construct N elements in DST from *SRC.  */
> > > --- gcc/hash-table.h.jj	2018-02-16 19:39:03.660058934 +0100
> > > +++ gcc/hash-table.h	2018-02-17 00:01:24.207723925 +0100
> > > @@ -808,7 +808,18 @@ hash_table<Descriptor, Allocator>::empty
> > >        for ( ; size; ++entries, --size)
> > >  	*entries = value_type ();
> > >  #else
> > > -      memset (entries, 0, size * sizeof (value_type));
> > > +      /* To workaround older GCC versions which don't handle value
> > > +	 initialization right, use a placement new into previously
> > > +	 cleared buffer.  */
> > > +      char buf[2 * sizeof (value_type) + 64];
> > > +      gcc_assert (__alignof__ (value_type) <= sizeof (value_type) + 64);
> > > +      char *q = (buf + sizeof (value_type) + 64
> > > +		 - ((uintptr_t) buf % __alignof__ (value_type)));
> > > +      memset (q, '\0', sizeof (value_type));
> > > +      value_type *p = ::new (q) value_type ();
> > > +      for ( ; size; ++entries, --size)
> > > +	*entries = *p;
> > > +      p->~value_type ();
> > >  #endif
> > >      }
> > >    m_n_deleted = 0;

	Jakub
Richard Biener Feb. 26, 2018, 12:31 p.m. UTC | #17
On Mon, 26 Feb 2018, Jakub Jelinek wrote:

> On Mon, Feb 26, 2018 at 01:15:45PM +0100, Richard Biener wrote:
> > > I'd like to ping this patch, or if it isn't acceptable, at least the vec.h
> > > part of it which isn't really more complex than what we have right now.
> > 
> > So BROKEN_VALUE_INITIALIZATION doesn't really mean it's "broken" but
> > that is has certain behavior that doesn't invalidate the clearing
> > done earlier?  That said, I wonder if first doing the regular
> > construction and memset afterwards in case of
> > BROKEN_VALUE_INITIALIZATION wouldn't be more reliable and "easier"?
> 
> BROKEN_VALUE_INITIALIZATION means the value initialization is broken,
> where broken means that for certain cases (for 4.2.[0-3] more than others)
> it doesn't initialize the object or some subobjects at all.
> memset after wouldn't make any sense, that is equivalent to just memset.
> What the patch is meant to fix are cases where even the broken value
> initialization invokes a ctor (when T has user defined default ctor).
> 
> There are several cases of
> vec<ipa_polymorphic_call_context> where ipa_polymorphic_call_context
> has user defined default ctor that doesn't clear the object, but instead
> sets a bunch of bool flags in it to true.  If we do just memset when
> built with older system G++ we get different behavior.
> 
> Even with the patch if BROKEN_VALUE_INITIALIZATION we won't handle e.g.
> cases where we'd have vec of a struct without user defined default ctor,
> but where one of the fields has ipa_polymorphic_call_context type or
> some other with non-trivial user defined default ctor that doesn't
> clear everything.  We don't have anything like that in the codebase
> right now, and if we would, a workaround would be to provide user defined
> default ctor for such types if we want to put them into vec.
> 
> Unlike the vec.h case, the hash-table.h is just for hypothetical case
> where we'd have a hash table with such types.

Ok, I see.  So the vec.h change is ok but please put a comment there
reflecting the above.

IMHO we should wait for the hypothetical case to happen for hash-table.h.

I still hope we can stop caring about too old GCC in a few years ;)

Richard.

> > > > --- gcc/vec.h.jj	2018-02-16 19:39:03.637058918 +0100
> > > > +++ gcc/vec.h	2018-02-16 23:51:55.389582056 +0100
> > > > @@ -490,12 +490,11 @@ template <typename T>
> > > >  inline void
> > > >  vec_default_construct (T *dst, unsigned n)
> > > >  {
> > > > -#ifndef BROKEN_VALUE_INITIALIZATION
> > > > -  for ( ; n; ++dst, --n)
> > > > -    ::new (static_cast<void*>(dst)) T ();
> > > > -#else
> > > > +#ifdef BROKEN_VALUE_INITIALIZATION
> > > >    memset (dst, '\0', sizeof (T) * n);
> > > >  #endif
> > > > +  for ( ; n; ++dst, --n)
> > > > +    ::new (static_cast<void*>(dst)) T ();
> > > >  }
> > > >  
> > > >  /* Copy-construct N elements in DST from *SRC.  */
> > > > --- gcc/hash-table.h.jj	2018-02-16 19:39:03.660058934 +0100
> > > > +++ gcc/hash-table.h	2018-02-17 00:01:24.207723925 +0100
> > > > @@ -808,7 +808,18 @@ hash_table<Descriptor, Allocator>::empty
> > > >        for ( ; size; ++entries, --size)
> > > >  	*entries = value_type ();
> > > >  #else
> > > > -      memset (entries, 0, size * sizeof (value_type));
> > > > +      /* To workaround older GCC versions which don't handle value
> > > > +	 initialization right, use a placement new into previously
> > > > +	 cleared buffer.  */
> > > > +      char buf[2 * sizeof (value_type) + 64];
> > > > +      gcc_assert (__alignof__ (value_type) <= sizeof (value_type) + 64);
> > > > +      char *q = (buf + sizeof (value_type) + 64
> > > > +		 - ((uintptr_t) buf % __alignof__ (value_type)));
> > > > +      memset (q, '\0', sizeof (value_type));
> > > > +      value_type *p = ::new (q) value_type ();
> > > > +      for ( ; size; ++entries, --size)
> > > > +	*entries = *p;
> > > > +      p->~value_type ();
> > > >  #endif
> > > >      }
> > > >    m_n_deleted = 0;
> 
> 	Jakub
> 
>
diff mbox series

Patch

--- gcc/system.h.jj	2018-01-03 10:19:54.530533857 +0100
+++ gcc/system.h	2018-02-15 21:56:37.168139415 +0100
@@ -824,6 +824,12 @@  extern void fancy_abort (const char *, i
 /* Some compilers do not allow the use of unsigned char in bitfields.  */
 #define BOOL_BITFIELD unsigned int
 
+/* GCC older than 4.4 have broken C++ value initialization handling, see
+   PR11309, PR30111, PR33916, PR82939 and PR84405 for more details.  */
+#if GCC_VERSION > 0 && GCC_VERSION < 4004 && !defined(__clang__)
+# define BROKEN_VALUE_INITIALIZATION
+#endif
+
 /* As the last action in this file, we poison the identifiers that
    shouldn't be used.  Note, luckily gcc-3.0's token-based integrated
    preprocessor won't trip on poisoned identifiers that arrive from
--- gcc/vec.h.jj	2018-01-03 10:19:54.606533869 +0100
+++ gcc/vec.h	2018-02-15 22:00:36.386216876 +0100
@@ -490,8 +490,12 @@  template <typename T>
 inline void
 vec_default_construct (T *dst, unsigned n)
 {
+#ifndef BROKEN_VALUE_INITIALIZATION
   for ( ; n; ++dst, --n)
     ::new (static_cast<void*>(dst)) T ();
+#else
+  memset (dst, '\0', sizeof (T) * n);
+#endif
 }
 
 /* Copy-construct N elements in DST from *SRC.  */
--- gcc/hash-table.h.jj	2018-01-03 10:19:54.899533916 +0100
+++ gcc/hash-table.h	2018-02-15 21:58:14.449170915 +0100
@@ -804,8 +804,12 @@  hash_table<Descriptor, Allocator>::empty
     }
   else
     {
+#ifndef BROKEN_VALUE_INITIALIZATION
       for ( ; size; ++entries, --size)
 	*entries = value_type ();
+#else
+      memset (entries, 0, size * sizeof (value_type));
+#endif
     }
   m_n_deleted = 0;
   m_n_elements = 0;