diff mbox

warn on mem calls modifying objects of non-trivial types (PR 80560)

Message ID b729a519-fdef-1ad7-3e33-4b3d451d1a1a@gmail.com
State New
Headers show

Commit Message

Martin Sebor April 29, 2017, 8:09 p.m. UTC
Calling memset, memcpy, or similar to write to an object of
a non-trivial type (such as one that defines a ctor or dtor,
or has such a member) can break the invariants otherwise
maintained by the class and cause undefined behavior.

The motivating example that prompted this work was a review of
a change that added to a plain old struct a new member with a ctor
and dtor (in this instance the member was of type std::vector).

To help catch problems of this sort some projects (such as GDB)
have apparently even devised their own clever solutions to detect
them: https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html.

The attached patch adds a new warning, -Wnon-trivial-memaccess,
that has GCC detect these mistakes.  The patch also fixes up
a handful of instances of the problem in GCC.  These instances
correspond to the two patterns below:

   struct A
   {
     void *p;
     void foo (int n) { p = malloc (n); }
     ~A () { free (p); }
   };

   void init (A *a)
   {
     memset (a, 0, sizeof *a);
   }

and

   struct B
   {
     int i;
     ~A ();
   };

   void copy (B *p, const B *q)
   {
     memcpy (p, q, sizeof *p);
     ...
    }

These aren't undefined and the patch could be tweaked to allow
them.  I decided not to invest effort into it because, although
not strictly erroneous, I think they represent poor practice.
The code would be more clearly (and more in the spirit of "good"
C++) written in terms of the default constructor and assignment
operator.  The first one like so:

   *a = A ();

and the second one like so:

   *b = *q;

Martin

Comments

Pedro Alves April 30, 2017, 8:02 p.m. UTC | #1
Hi Martin,

Thanks much for doing this.  A few comments below, in light of my
experience doing the equivalent checks in the gdb patch linked below,
using standard C++11.

On 04/29/2017 09:09 PM, Martin Sebor wrote:
> Calling memset, memcpy, or similar to write to an object of
> a non-trivial type (such as one that defines a ctor or dtor,
> or has such a member) can break the invariants otherwise
> maintained by the class and cause undefined behavior.
> 
> The motivating example that prompted this work was a review of
> a change that added to a plain old struct a new member with a ctor
> and dtor (in this instance the member was of type std::vector).
> 
> To help catch problems of this sort some projects (such as GDB)
> have apparently even devised their own clever solutions to detect
> them: https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html.
> 
> The attached patch adds a new warning, -Wnon-trivial-memaccess,
> that has GCC detect these mistakes.  The patch also fixes up
> a handful of instances of the problem in GCC.  These instances
> correspond to the two patterns below:
> 
>   struct A
>   {
>     void *p;
>     void foo (int n) { p = malloc (n); }
>     ~A () { free (p); }
>   };
> 
>   void init (A *a)
>   {
>     memset (a, 0, sizeof *a);
>   }
> 
> and
> 
>   struct B
>   {
>     int i;
>     ~A ();
>   };

(typo: "~B ();")

> 
>   void copy (B *p, const B *q)
>   {
>     memcpy (p, q, sizeof *p);
>     ...
>    }
> 

IMO the check should be relaxed from "type is trivial" to "type is
trivially copyable" (which is what the gdb detection at
https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html
uses for memcpy/memmove).  Checking that the destination is trivial is
going to generate false positives -- specifically, [basic-types]/3
specifies that it's fine to memcpy trivially _copyable_ types, not
trivial types.  A type can be both non-trivial and trivially copyable
at the same time.  For example, this compiles, but triggers
your new warning:

#include <stdlib.h>
#include <string.h>
#include <type_traits>

struct NonTrivialButTriviallyCopyable
{
  NonTrivialButTriviallyCopyable () : i (0) {}
  int i;
};

static_assert (!std::is_trivial<NonTrivialButTriviallyCopyable>::value, "");
static_assert (std::is_trivially_copyable<NonTrivialButTriviallyCopyable>::value, "");

void copy (NonTrivialButTriviallyCopyable *dst, NonTrivialButTriviallyCopyable *src)
{
  memcpy (dst, src, sizeof (*src));
}

$ /opt/gcc/bin/g++ -std=gnu++11 trivial-warn.cc -o trivial-warn -g3 -O0 -Wall -Wextra -c
trivial-warn.cc: In function ‘void copy(NonTrivialButTriviallyCopyable*, NonTrivialButTriviallyCopyable*)’:
trivial-warn.cc:16:34: warning: calling ‘void* memcpy(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘struct NonTrivialButTriviallyCopyable’ [-Wnon-trivial-memaccess]
   memcpy (dst, src, sizeof (*src));
                                  ^
$

Implementations of vector-like classes can very well (and are
encouraged) to make use of std::is_trivially_copyable to know whether
they can copy a range of elements to new storage
using memcpy/memmove/mempcpy.

Running your patch against GDB trips on such a case:

src/gdb/btrace.h: In function ‘btrace_insn_s* VEC_btrace_insn_s_quick_insert(VEC_btrace_insn_s*, unsigned int, const btrace_insn_s*, const char*, unsigned int)’:
src/gdb/common/vec.h:948:62: error: calling ‘void* memmove(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘btrace_insn_s {aka struct btrace_insn}’ [-Werror=non-trivial-memaccess]
   memmove (slot_ + 1, slot_, (vec_->num++ - ix_) * sizeof (T));    \
                                                              ^

There is nothing wrong with the code being warned here.
While "struct btrace_insn" is trivial (has a user-provided default
ctor), it is still trivially copyable.

Now, this gdb code is using the old VEC (originated from
gcc's C days, it's not the current C++fied VEC implementation),
but the point is that any other random vector-like container out there
is free to optimize copy of a range of non-trivial but trivially
copyable types using memcpy/memmove.

Note that libstdc++ does not actually do that optimization, but
that's just a missed optimization, see PR libstdc++/68350 [1]
"std::uninitialized_copy overly restrictive for
trivially_copyable types".  (libstdc++'s std::vector defers
copy to std::unitialized_copy.)

[1] - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68350

> These aren't undefined and the patch could be tweaked to allow
> them.  

I think they're undefined because the types are not trivially
copyable (non-trivial destructor with side effects).

> I decided not to invest effort into it because, although
> not strictly erroneous, I think they represent poor practice.

I think that for my suggestion, all you really need is use
call trivially_copyable_p instead of trivial_type_p, for
memcpy/memmove/mempcpy at least.

For memset, I'd suggest to go the other direction actually, and
instead of relaxing the trivial check, to tighten it, by warning about
memset'ting objects of non-standard-layout type too.  I.e., warn for
memset of all non-POD (non-trivial + non-standard-layout) types.  That's
what I did in the gdb patch.  That would also produce warnings for memset
of trivial types that via refactoring end up with references, like:

struct Trivial
{
  Trivial () = default;
  Trivial (int &i) : m_i (i) {}
  void add (int howmuch) { m_i += howmuch; }

private:
  int &m_i;
};

void reset (Trivial *triv)
{
  memset (triv, 0, sizeof (Trivial));
}

void recompute (Trivial *triv)
{
  reset (triv); // start over
  triv->add (10); // whoops, null reference.
}

It's also warn for memset of trivial types that aren't standard
layout due to having a mix of public/protected/private fields,
which is likely not a real problem in practice, but I'd call
those a code smell that warrants a warning too:

 struct S
 {
 private:
   int a;
 public:
   int b;
 };

 S s;
 memset (&s, 0, sizeof (S));


Playing with the patch, I noticed that you can't silence
it by casting the pointer to void, but you can with
casting to char, like:

void copy (B *dst, B *src)
{
    memcpy (dst, src, sizeof (*src)); // warns
    memcpy ((void*) dst, (void *) src, sizeof (*src)); // still warns
    memcpy ((char*) dst, (char *) src, sizeof (*src)); // doesn't warn
    memcpy ((unsigned char*) dst, (unsigned char *) src, sizeof (*src)); // doesn't warn
}

I can understand how we end up like that, given char's magic properties, but still
I think many will reach for "void" first if they (really really) need to add a cast to
silence the warning.  In any case, I think it'd be very nice to add cases with such
casts to the new tests, and maybe mention it in the documentation too.

Thanks,
Pedro Alves
Joseph Myers April 30, 2017, 11:39 p.m. UTC | #2
On Sat, 29 Apr 2017, Martin Sebor wrote:

> +The safe way to either initialize or "reset" objects of non-trivial

Should use TeX quotes in Texinfo files, ``reset''.
Jason Merrill May 1, 2017, 3:49 p.m. UTC | #3
Pedro's suggestions all sound good to me.

Jason
Martin Sebor May 3, 2017, 4:07 p.m. UTC | #4
On 04/30/2017 05:39 PM, Joseph Myers wrote:
> On Sat, 29 Apr 2017, Martin Sebor wrote:
>
>> +The safe way to either initialize or "reset" objects of non-trivial
>
> Should use TeX quotes in Texinfo files, ``reset''.

Heh :) I wrestled with Emacs to get rid of those,  It kept replacing
my plain quotes with the TeX kind but in the end I won.  Sounds like
I should have trusted it.  Let me restore it in the follow-on patch.

Martin
Martin Sebor May 11, 2017, 4:23 p.m. UTC | #5
On 04/30/2017 02:02 PM, Pedro Alves wrote:
> Hi Martin,
>
> Thanks much for doing this.  A few comments below, in light of my
> experience doing the equivalent checks in the gdb patch linked below,
> using standard C++11.

Thanks for the feedback!  It gave me quite a bit to think about.

The challenge with using memcpy or memset with class types is
figuring out if it's being called to copy-construct a new object
or assign a new value to an existing one.  In general it's not
possible to tell so the conservative assumption is that it's
the latter.

Because of that, relying on the trivially copyable property to
determine whether it's safe to assign a new value to an object
is not sufficient (or even necessary).  The class must be at
a minimum trivially assignable.  But it turns out that even
trivial assignability as defined by C++ isn't enough.  A class
with a const data member or a member of a reference type is
considered "trivially assignable" but its copy assignment is
deleted, and so it can't be assigned to.   Calling memset or
memcpy to write over an object of such a class can violate
the const invariant or corrupt the reference, respectively.

On the other hand, relying on the standard layout property
to decide whether or not calling memset on an object is safe,
while on the surface reasonable, is at the same time too strict
and overly permissive.  It's too strict because it warns for
calls where the destination is an object of a trivial derived
class that declares data members in one of its bases as well as
in the derived class (GCC has code like that).  It's not strict
enough because it doesn't catch cases where the class contains
only private or only protected data members (GCC is guilty of
abusing memset to violate encapsulation like that as well).

That said, I'm testing a solution that overcomes these problems.
I adjusted it so it doesn't warn on the GDB code in your example
(or any GDB code on trunk), even though in my opinion "tricks"
like that would best be avoided in favor of safer alternatives.

Unlike in C, the preferred way to initialize objects in C++
is to use some form of initialization (as opposed to memset).
The preferred way to copy objects is using the copy ctor or
assignment operator (as opposed to memcpy).  Using the special
member functions is clearer, less prone to mistakes and so
safer, and in some cases can also be more efficient.  Memcpy
and memset should be reserved for manipulating raw storage,
not typed objects.

Martin
Jakub Jelinek May 11, 2017, 4:34 p.m. UTC | #6
On Thu, May 11, 2017 at 10:23:48AM -0600, Martin Sebor wrote:
> Unlike in C, the preferred way to initialize objects in C++
> is to use some form of initialization (as opposed to memset).
> The preferred way to copy objects is using the copy ctor or
> assignment operator (as opposed to memcpy).  Using the special
> member functions is clearer, less prone to mistakes and so
> safer, and in some cases can also be more efficient.  Memcpy
> and memset should be reserved for manipulating raw storage,
> not typed objects.

But optimizers should be able to turn the copy ctors/assignment operators
or default ctors into memcpy or memset where possible, making
it efficient again.  And that is something that doesn't work well yet.

	Jakub
Martin Sebor May 11, 2017, 4:57 p.m. UTC | #7
On 05/11/2017 10:34 AM, Jakub Jelinek wrote:
> On Thu, May 11, 2017 at 10:23:48AM -0600, Martin Sebor wrote:
>> Unlike in C, the preferred way to initialize objects in C++
>> is to use some form of initialization (as opposed to memset).
>> The preferred way to copy objects is using the copy ctor or
>> assignment operator (as opposed to memcpy).  Using the special
>> member functions is clearer, less prone to mistakes and so
>> safer, and in some cases can also be more efficient.  Memcpy
>> and memset should be reserved for manipulating raw storage,
>> not typed objects.
>
> But optimizers should be able to turn the copy ctors/assignment operators
> or default ctors into memcpy or memset where possible, making
> it efficient again.  And that is something that doesn't work well yet.

I was referring to GCC not expanding inline calls to memcpy with
non-const size, as in function g below, whereas in f, the loop is
fully inlined for (likely) better performance.  This is meant to
be close to the GDB example of a vector-like class that uses
memcpy (or memmove) instead of ordinary assignment.

   struct S { char a[32]; };

   void f (S *p, const S *q, unsigned n)
   {
     while (n--)
       *p++ = *q++;
   }

   void g (S *p, const S *q, unsigned n)
   {
     __builtin_memcpy (p, q, sizeof *q * n);
   }

Martin
Jason Merrill May 16, 2017, 7:41 p.m. UTC | #8
On Thu, May 11, 2017 at 12:23 PM, Martin Sebor <msebor@gmail.com> wrote:
> The challenge with using memcpy or memset with class types is
> figuring out if it's being called to copy-construct a new object
> or assign a new value to an existing one.  In general it's not
> possible to tell so the conservative assumption is that it's
> the latter.
>
> Because of that, relying on the trivially copyable property to
> determine whether it's safe to assign a new value to an object
> is not sufficient (or even necessary).  The class must be at
> a minimum trivially assignable.  But it turns out that even
> trivial assignability as defined by C++ isn't enough.  A class
> with a const data member or a member of a reference type is
> considered "trivially assignable" but its copy assignment is
> deleted, and so it can't be assigned to.   Calling memset or
> memcpy to write over an object of such a class can violate
> the const invariant or corrupt the reference, respectively.

Yes, "trivially copyable" elides the differences between
initialization and assignment context.  I agree that it makes sense to
check for a trivial assignment operator specifically.  I guess we want
a slightly stronger "trivially copyable" that also requires a
non-deleted assignment operator.

It seems to me that the relevant tests are:

bcopy/memcpy/memmove want trivally copyable + non-deleted assignment.
bzero/memset want trivial + non-deleted assignment.

I'm still not convinced we need to consider standard-layout at all.

Jason
Martin Sebor May 16, 2017, 9:39 p.m. UTC | #9
On 05/16/2017 01:41 PM, Jason Merrill wrote:
> On Thu, May 11, 2017 at 12:23 PM, Martin Sebor <msebor@gmail.com> wrote:
>> The challenge with using memcpy or memset with class types is
>> figuring out if it's being called to copy-construct a new object
>> or assign a new value to an existing one.  In general it's not
>> possible to tell so the conservative assumption is that it's
>> the latter.
>>
>> Because of that, relying on the trivially copyable property to
>> determine whether it's safe to assign a new value to an object
>> is not sufficient (or even necessary).  The class must be at
>> a minimum trivially assignable.  But it turns out that even
>> trivial assignability as defined by C++ isn't enough.  A class
>> with a const data member or a member of a reference type is
>> considered "trivially assignable" but its copy assignment is
>> deleted, and so it can't be assigned to.   Calling memset or
>> memcpy to write over an object of such a class can violate
>> the const invariant or corrupt the reference, respectively.
>
> Yes, "trivially copyable" elides the differences between
> initialization and assignment context.  I agree that it makes sense to
> check for a trivial assignment operator specifically.  I guess we want
> a slightly stronger "trivially copyable" that also requires a
> non-deleted assignment operator.
>
> It seems to me that the relevant tests are:
>
> bcopy/memcpy/memmove want trivally copyable + non-deleted assignment.
> bzero/memset want trivial + non-deleted assignment.

I think that's very close to what the patch does.

bzero/memset: expects a trivial class with no private
members and with a non-deleted trivial assignment.  That looks
the same as what you have above.

bcopy/memcpy/memmove: expects the same plus trivially copyable,
except that the class may be non-trivial if the source of the
copy is any of void*, [signed or unsigned] char*, or a pointer
to the class itself.  In that case, the byte size must either
be non-constant or a multiple of the size of the object, to
help detect inadvertent partial copies.

The test for class triviality is to detect misusing the functions
to initialize (construct) an object of a class with a user-defined
default ctor by copying bytes into it from an object of unrelated
type (void* and char* are allowed for serialization).

I did forget about bcopy.  Let me add it.

>
> I'm still not convinced we need to consider standard-layout at all.

I agree.  The patch doesn't make use of is_standard_layout_p().
It defines its own helper called almost_std_layout_p() that
combines trivial_type_p() with tests for non-public members,
ignoring the requirement that all members of a derived standard
layout class must be defined in the same base (or derived) class.

Is there something you'd like to see done differently in the
latest revision of the patch?

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00976.html

Martin
Pedro Alves May 16, 2017, 10:48 p.m. UTC | #10
On 05/16/2017 08:41 PM, Jason Merrill wrote:

> I agree that it makes sense to
> check for a trivial assignment operator specifically.  I guess we want
> a slightly stronger "trivially copyable" that also requires a
> non-deleted assignment operator.
> 
> It seems to me that the relevant tests are:
> 
> bcopy/memcpy/memmove want trivally copyable + non-deleted assignment.
> bzero/memset want trivial + non-deleted assignment.
> 
> I'm still not convinced we need to consider standard-layout at all.

What do you think of warning for memset of types with references?  Since NULL
references are undefined behavior (N4659, [dcl.ref]/5), IMHO such code is quite
smelly and most likely a sign of incomplete refactoring, not design.

(My original intention for poisoning memset of standard-layout type in gdb was
mainly as proxy/approximation for "type with references".  Other properties
that make a type non-standard-layout are not as interesting to me.)

While at it, maybe the same reasoning would justify warn of memcpy/memset
of types with const data members?  memcpy of such types kind of sounds like a
recipe for subtle breakage that could only be salvaged with std::launder.
And if you know about that, you'll probably be copying objects of type T
to/from raw byte/char storage, not to/from another T.

Actually, now that I look at Martin's new patch, I see he's already warning
about const data members and references, both memcpy and memset.  I'm not
super convinced on warning about memcpy of references (unlike memset), but
I don't feel strongly against it either.  I'd be fine (FWIW) with giving it a
try and see what breaks out there.

> +Wnon-trivial-memaccess
> +C++ ObjC++ Var(warn_nontrival_memaccess) Warning LangEnabledBy(C++ ObjC++, Wall)
> +Warn for raw memory writes to objects of non-trivial types.

May I suggest renaming the warning (and the description above) from
-Wnon-trivial-memaccess to something else that avoids "trivial" in
its name?  Because it's no longer strictly about "trivial" in the
standard C++ sense.  The documentation talks about "The safe way",
and "does not warn on safe calls", so maybe call it -Wunsafe-memaccess?

(I spotted a couple typos in the new patch: "otherwse", "becase", btw.)

Thanks,
Pedro Alves
Martin Sebor May 17, 2017, 1:55 a.m. UTC | #11
On 05/16/2017 04:48 PM, Pedro Alves wrote:
> On 05/16/2017 08:41 PM, Jason Merrill wrote:
>
>> I agree that it makes sense to
>> check for a trivial assignment operator specifically.  I guess we want
>> a slightly stronger "trivially copyable" that also requires a
>> non-deleted assignment operator.
>>
>> It seems to me that the relevant tests are:
>>
>> bcopy/memcpy/memmove want trivally copyable + non-deleted assignment.
>> bzero/memset want trivial + non-deleted assignment.
>>
>> I'm still not convinced we need to consider standard-layout at all.
>
> What do you think of warning for memset of types with references?  Since NULL
> references are undefined behavior (N4659, [dcl.ref]/5), IMHO such code is quite
> smelly and most likely a sign of incomplete refactoring, not design.
>
> (My original intention for poisoning memset of standard-layout type in gdb was
> mainly as proxy/approximation for "type with references".  Other properties
> that make a type non-standard-layout are not as interesting to me.)
>
> While at it, maybe the same reasoning would justify warn of memcpy/memset
> of types with const data members?  memcpy of such types kind of sounds like a
> recipe for subtle breakage that could only be salvaged with std::launder.
> And if you know about that, you'll probably be copying objects of type T
> to/from raw byte/char storage, not to/from another T.
>
> Actually, now that I look at Martin's new patch, I see he's already warning
> about const data members and references, both memcpy and memset.  I'm not
> super convinced on warning about memcpy of references (unlike memset), but
> I don't feel strongly against it either.  I'd be fine (FWIW) with giving it a
> try and see what breaks out there.

I did this because objects with references cannot be assigned
to (the default copy assignment is deleted).  So as a baseline
rule, I made the warning trigger whenever a native assignment
or copy isn't valid.  In the IMO unlikely event that a memcpy
over a reference is intended, the warning is easy to suppress.

I expect calling memset on an object with a reference to almost
certainly be a bug since there's no way to make such a reference
usable.  The only time it might be intentional is when someone
tries to wipe out the storage before deleting the object in
the storage (e.g., in a dtor).  But that's a misuse as well
because such calls are typically eliminated, much to many
a security analyst's shock and horror.  I'd like to eventually
diagnose that too (though possibly under a different warning).

I used a similar (though not exactly the same) rationale for
warning for const members.  They too cannot be assigned to,
and letting memset or memcpy silently change them violates
const-correctnes.  It's also undefined and the immutability
of such members an optimization opportunity waiting to be
exploited.

>> +Wnon-trivial-memaccess
>> +C++ ObjC++ Var(warn_nontrival_memaccess) Warning LangEnabledBy(C++ ObjC++, Wall)
>> +Warn for raw memory writes to objects of non-trivial types.
>
> May I suggest renaming the warning (and the description above) from
> -Wnon-trivial-memaccess to something else that avoids "trivial" in
> its name?  Because it's no longer strictly about "trivial" in the
> standard C++ sense.  The documentation talks about "The safe way",
> and "does not warn on safe calls", so maybe call it -Wunsafe-memaccess?

I debated whether to rename things (not just the warning but
also the function that implements it in GCC).  Ultimately I
decided it wasn't necessary because the rules seem close enough
to be based on some notion of triviality and because no better
name came to mind. -Wunsafe-memaccess might work.  The one mild
concern I have with it is that it could suggest it might do more
than simple type checking (e.g., buffer overflow and what not).
Let's see what Jason thinks.

> (I spotted a couple typos in the new patch: "otherwse", "becase", btw.)

I'm a horrible typist.  I'll proofread the patch again and fix
them up before committing it.

Martin
Pedro Alves May 17, 2017, 11:13 a.m. UTC | #12
On 05/17/2017 02:55 AM, Martin Sebor wrote:
> On 05/16/2017 04:48 PM, Pedro Alves wrote:
>> On 05/16/2017 08:41 PM, Jason Merrill wrote:
>>
>>> I agree that it makes sense to
>>> check for a trivial assignment operator specifically.  I guess we want
>>> a slightly stronger "trivially copyable" that also requires a
>>> non-deleted assignment operator.
>>>
>>> It seems to me that the relevant tests are:
>>>
>>> bcopy/memcpy/memmove want trivally copyable + non-deleted assignment.
>>> bzero/memset want trivial + non-deleted assignment.
>>>
>>> I'm still not convinced we need to consider standard-layout at all.
>>
>> What do you think of warning for memset of types with references? 

Having slept, I now realize you had that covered already by the
"non-deleted assignment" requirement...  A reference data member
makes the assignment operator be implicitly deleted.  Sorry for the noise.

>> While at it, maybe the same reasoning would justify warn of memcpy/memset
>> of types with const data members?  

Ditto.

> I did this because objects with references cannot be assigned
> to (the default copy assignment is deleted).  So as a baseline
> rule, I made the warning trigger whenever a native assignment
> or copy isn't valid.  In the IMO unlikely event that a memcpy
> over a reference is intended, the warning is easy to suppress.

Agreed.

I wondered whether we'll end up wanting to distinguish these cases:

#1    memcpy (T *, const T *src, size_t n);
#2.1  memcpy (T *, const char *src, size_t n);  // char, void, std::byte...
#2.2  memcpy (char *, const T *src, size_t n);  // char, void, std::byte...
#3    memcpy (T *, const U *src, size_t n);

Where:
- T is a trivially copyable type that still triggers the new warning.
- U is a type unrelated to T, and is not (unsigned) char, void or std::byte.

#1 is the case that looks like copy.

#2.1 and 2.2 may well appear in type-erasing code.

#3 would look like a way to work around aliasing issues and (even though
ISTR that it's OK as GNU extension if T and U are trivial) worthy of a
warning even if T is trivial under the definition of the warning.
Reading your updated patch, I see that you warn already when T is trivial
and U is not trivial, but IIUC, not if U is also trivial, even if
unrelated to T.  Anyway, I don't really want to argue about this -- I
started writing this paragraph before actually reading the patch, and
then actually read the patch and was pleasantly surprised with what
I saw.   I think it's looking great.

> I used a similar (though not exactly the same) rationale for
> warning for const members.  They too cannot be assigned to,
> and letting memset or memcpy silently change them violates
> const-correctnes. 

*Nod*

> It's also undefined 

I'm not sure, I think there may be nuances, as usual.  AFAICS, it's generally
valid to memcpy into trivially copyable types that have const members to/from
untyped/char storage, AFAICS.  Might need to apply std::launder afterwards.  But it
isn't clear to me, since whether memcpy starts a (trivial) object's lifetime is
up in the air, AFAIK.  But I'm not suggesting to really consider that.  The rare
specialized code will be able to work around the spurious warning.

> and the immutability
> of such members an optimization opportunity waiting to be
> exploited.
> 

*nod*

>>> +Wnon-trivial-memaccess
>>> +C++ ObjC++ Var(warn_nontrival_memaccess) Warning LangEnabledBy(C++
>>> ObjC++, Wall)
>>> +Warn for raw memory writes to objects of non-trivial types.
>>
>> May I suggest renaming the warning (and the description above) from
>> -Wnon-trivial-memaccess to something else that avoids "trivial" in
>> its name?  Because it's no longer strictly about "trivial" in the
>> standard C++ sense.  The documentation talks about "The safe way",
>> and "does not warn on safe calls", so maybe call it -Wunsafe-memaccess?
> 
> I debated whether to rename things (not just the warning but
> also the function that implements it in GCC).  Ultimately I
> decided it wasn't necessary because the rules seem close enough
> to be based on some notion of triviality and because no better
> name came to mind. -Wunsafe-memaccess might work.  The one mild
> concern I have with it is that it could suggest it might do more
> than simple type checking (e.g., buffer overflow and what not).
> Let's see what Jason thinks.

Yet another motivation of avoiding "trivial" that crossed my mind is that
you may want to enable the warning in C too, e.g., for warning about
the memcpy of types with const members.  But C as no concept
of "trivial", everything is "trivial".

>> (I spotted a couple typos in the new patch: "otherwse", "becase", btw.)
> 
> I'm a horrible typist.  I'll proofread the patch again and fix
> them up before committing it.

Thanks much for working on this.  I think this will uncover lots of
latent bugs in many codebases.  Looking forward to have this in.

Thanks,
Pedro Alves
Jason Merrill May 19, 2017, 7:07 p.m. UTC | #13
On Tue, May 16, 2017 at 5:39 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 05/16/2017 01:41 PM, Jason Merrill wrote:
>
>> I'm still not convinced we need to consider standard-layout at all.
>
> I agree.  The patch doesn't make use of is_standard_layout_p().
> It defines its own helper called almost_std_layout_p() that
> combines trivial_type_p() with tests for non-public members,

That's the part that seems unnecessary.  Why do we care about
non-public members?

Jason
Martin Sebor May 19, 2017, 8:07 p.m. UTC | #14
On 05/19/2017 01:07 PM, Jason Merrill wrote:
> On Tue, May 16, 2017 at 5:39 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 05/16/2017 01:41 PM, Jason Merrill wrote:
>>
>>> I'm still not convinced we need to consider standard-layout at all.
>>
>> I agree.  The patch doesn't make use of is_standard_layout_p().
>> It defines its own helper called almost_std_layout_p() that
>> combines trivial_type_p() with tests for non-public members,
>
> That's the part that seems unnecessary.  Why do we care about
> non-public members?

Because modifying them breaks encapsulation.

If I take a legacy struct, make some of its members private,
and define accessors and modifiers to manipulate those members
and maintain invariants between them, I will want to check and
adjust all code that changes objects of the struct in ways that
might violate the invariants.

What common use case are you concerned about that isn't more
appropriately expressed using the generated default or copy
constructor or assignment operator?

Martin
Jason Merrill May 19, 2017, 9:42 p.m. UTC | #15
On Fri, May 19, 2017 at 4:07 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 05/19/2017 01:07 PM, Jason Merrill wrote:
>>
>> On Tue, May 16, 2017 at 5:39 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 05/16/2017 01:41 PM, Jason Merrill wrote:
>>>
>>>> I'm still not convinced we need to consider standard-layout at all.
>>>
>>>
>>> I agree.  The patch doesn't make use of is_standard_layout_p().
>>> It defines its own helper called almost_std_layout_p() that
>>> combines trivial_type_p() with tests for non-public members,
>>
>>
>> That's the part that seems unnecessary.  Why do we care about
>> non-public members?
>
> Because modifying them breaks encapsulation.

Not if you're clearing/copying the object as a whole.

> If I take a legacy struct, make some of its members private,
> and define accessors and modifiers to manipulate those members
> and maintain invariants between them, I will want to check and
> adjust all code that changes objects of the struct in ways that
> might violate the invariants.

For a trivial type, worrying about invariants doesn't make sense to
me, since default-initialization won't establish any invariants.  And
bzero or memset(0) will have the same effect as value-initialization
(if zero_init_p (type); we probably want to check that).  If you're
going to establish invariants, I would expect you to write a default
constructor, which would make the class non-trivial.

> What common use case are you concerned about that isn't more
> appropriately expressed using the generated default or copy
> constructor or assignment operator?

None; I am concerned about focusing the warning on code that is
actually likely to be problematic.

Jason
Martin Sebor May 21, 2017, 11:59 p.m. UTC | #16
On 05/19/2017 03:42 PM, Jason Merrill wrote:
> On Fri, May 19, 2017 at 4:07 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 05/19/2017 01:07 PM, Jason Merrill wrote:
>>>
>>> On Tue, May 16, 2017 at 5:39 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>>
>>>> On 05/16/2017 01:41 PM, Jason Merrill wrote:
>>>>
>>>>> I'm still not convinced we need to consider standard-layout at all.
>>>>
>>>>
>>>> I agree.  The patch doesn't make use of is_standard_layout_p().
>>>> It defines its own helper called almost_std_layout_p() that
>>>> combines trivial_type_p() with tests for non-public members,
>>>
>>>
>>> That's the part that seems unnecessary.  Why do we care about
>>> non-public members?
>>
>> Because modifying them breaks encapsulation.
>
> Not if you're clearing/copying the object as a whole.
>
>> If I take a legacy struct, make some of its members private,
>> and define accessors and modifiers to manipulate those members
>> and maintain invariants between them, I will want to check and
>> adjust all code that changes objects of the struct in ways that
>> might violate the invariants.
>
> For a trivial type, worrying about invariants doesn't make sense to
> me, since default-initialization won't establish any invariants.  And
> bzero or memset(0) will have the same effect as value-initialization
> (if zero_init_p (type); we probably want to check that).  If you're
> going to establish invariants, I would expect you to write a default
> constructor, which would make the class non-trivial.

Thanks for the zero_init_p pointer!  Let me add that to the patch
along with Pedro's suggestion to use the current C++ terminology,
retest and resubmit.

In most cases you're right that defining the default constructor
is the way to go.  There is are a couple of use cases where a ctor
tends to be avoided: when objects the class need to be initialized
statically, and where they need to be PODs.  GCC itself relies on
the latter (e.g., some of the vec templates), apparently because
it stores them in unions.  It doesn't look tome like these vec
templates maintain much of an invariant of any kind, but they
easily could.

An example of the static initialization case is an atomic class
(not necessarily std::atomic though I think this would apply to
it as well).  Its objects usually need to be statically default-
initializable (without running any ctors) but then they must be
explicitly initialized by some sort of an init call to be made
valid, and their state can only be accessed and modified via
member functions (and so their state is private).  Modifying
them by any other means, including by memset or memcpy, is
undefined.

>
>> What common use case are you concerned about that isn't more
>> appropriately expressed using the generated default or copy
>> constructor or assignment operator?
>
> None; I am concerned about focusing the warning on code that is
> actually likely to be problematic.

Hopefully the above explanation resolves that concern.  If not,
please let me know.

Martin
Jason Merrill May 22, 2017, 4:50 a.m. UTC | #17
On Sun, May 21, 2017 at 7:59 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 05/19/2017 03:42 PM, Jason Merrill wrote:
>> On Fri, May 19, 2017 at 4:07 PM, Martin Sebor <msebor@gmail.com> wrote:
>>> On 05/19/2017 01:07 PM, Jason Merrill wrote:
>>>> On Tue, May 16, 2017 at 5:39 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>>> On 05/16/2017 01:41 PM, Jason Merrill wrote:
>>>>>
>>>>>> I'm still not convinced we need to consider standard-layout at all.
>>>>>
>>>>> I agree.  The patch doesn't make use of is_standard_layout_p().
>>>>> It defines its own helper called almost_std_layout_p() that
>>>>> combines trivial_type_p() with tests for non-public members,
>>>>
>>>>
>>>>
>>>> That's the part that seems unnecessary.  Why do we care about
>>>> non-public members?
>>>
>>>
>>> Because modifying them breaks encapsulation.
>>
>>
>> Not if you're clearing/copying the object as a whole.
>>
>>> If I take a legacy struct, make some of its members private,
>>> and define accessors and modifiers to manipulate those members
>>> and maintain invariants between them, I will want to check and
>>> adjust all code that changes objects of the struct in ways that
>>> might violate the invariants.
>>
>> For a trivial type, worrying about invariants doesn't make sense to
>> me, since default-initialization won't establish any invariants.  And
>> bzero or memset(0) will have the same effect as value-initialization
>> (if zero_init_p (type); we probably want to check that).  If you're
>> going to establish invariants, I would expect you to write a default
>> constructor, which would make the class non-trivial.
>
> Thanks for the zero_init_p pointer!  Let me add that to the patch
> along with Pedro's suggestion to use the current C++ terminology,
> retest and resubmit.
>
> In most cases you're right that defining the default constructor
> is the way to go.  There is are a couple of use cases where a ctor
> tends to be avoided: when objects the class need to be initialized
> statically, and where they need to be PODs.  GCC itself relies on
> the latter (e.g., some of the vec templates), apparently because
> it stores them in unions.  It doesn't look tome like these vec
> templates maintain much of an invariant of any kind, but they
> easily could.
>
> An example of the static initialization case is an atomic class
> (not necessarily std::atomic though I think this would apply to
> it as well).  Its objects usually need to be statically default-
> initializable (without running any ctors) but then they must be
> explicitly initialized by some sort of an init call to be made
> valid, and their state can only be accessed and modified via
> member functions (and so their state is private).  Modifying
> them by any other means, including by memset or memcpy, is
> undefined.
>
>>
>>> What common use case are you concerned about that isn't more
>>> appropriately expressed using the generated default or copy
>>> constructor or assignment operator?
>>
>>
>> None; I am concerned about focusing the warning on code that is
>> actually likely to be problematic.
>
>
> Hopefully the above explanation resolves that concern.  If not,
> please let me know.

I still think that we shouldn't warn about zeroing out a
non-standard-layout object when value-initialization would produce the
exact same result.  If you want to keep the almost_std_layout_p check
for the non-zero memset case, that's fine.

Jason
Andrew Pinski July 5, 2017, 8:58 p.m. UTC | #18
On Sun, Apr 30, 2017 at 1:02 PM, Pedro Alves <palves@redhat.com> wrote:
> Hi Martin,
>
> Thanks much for doing this.  A few comments below, in light of my
> experience doing the equivalent checks in the gdb patch linked below,
> using standard C++11.
>
> On 04/29/2017 09:09 PM, Martin Sebor wrote:
>> Calling memset, memcpy, or similar to write to an object of
>> a non-trivial type (such as one that defines a ctor or dtor,
>> or has such a member) can break the invariants otherwise
>> maintained by the class and cause undefined behavior.
>>
>> The motivating example that prompted this work was a review of
>> a change that added to a plain old struct a new member with a ctor
>> and dtor (in this instance the member was of type std::vector).
>>
>> To help catch problems of this sort some projects (such as GDB)
>> have apparently even devised their own clever solutions to detect
>> them: https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html.
>>
>> The attached patch adds a new warning, -Wnon-trivial-memaccess,
>> that has GCC detect these mistakes.  The patch also fixes up
>> a handful of instances of the problem in GCC.  These instances
>> correspond to the two patterns below:
>>
>>   struct A
>>   {
>>     void *p;
>>     void foo (int n) { p = malloc (n); }
>>     ~A () { free (p); }
>>   };
>>
>>   void init (A *a)
>>   {
>>     memset (a, 0, sizeof *a);
>>   }
>>
>> and
>>
>>   struct B
>>   {
>>     int i;
>>     ~A ();
>>   };
>
> (typo: "~B ();")
>
>>
>>   void copy (B *p, const B *q)
>>   {
>>     memcpy (p, q, sizeof *p);
>>     ...
>>    }
>>
>
> IMO the check should be relaxed from "type is trivial" to "type is
> trivially copyable" (which is what the gdb detection at
> https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html
> uses for memcpy/memmove).  Checking that the destination is trivial is
> going to generate false positives -- specifically, [basic-types]/3
> specifies that it's fine to memcpy trivially _copyable_ types, not
> trivial types.  A type can be both non-trivial and trivially copyable
> at the same time.  For example, this compiles, but triggers
> your new warning:
>
> #include <stdlib.h>
> #include <string.h>
> #include <type_traits>
>
> struct NonTrivialButTriviallyCopyable
> {
>   NonTrivialButTriviallyCopyable () : i (0) {}
>   int i;
> };
>
> static_assert (!std::is_trivial<NonTrivialButTriviallyCopyable>::value, "");
> static_assert (std::is_trivially_copyable<NonTrivialButTriviallyCopyable>::value, "");
>
> void copy (NonTrivialButTriviallyCopyable *dst, NonTrivialButTriviallyCopyable *src)
> {
>   memcpy (dst, src, sizeof (*src));
> }
>
> $ /opt/gcc/bin/g++ -std=gnu++11 trivial-warn.cc -o trivial-warn -g3 -O0 -Wall -Wextra -c
> trivial-warn.cc: In function ‘void copy(NonTrivialButTriviallyCopyable*, NonTrivialButTriviallyCopyable*)’:
> trivial-warn.cc:16:34: warning: calling ‘void* memcpy(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘struct NonTrivialButTriviallyCopyable’ [-Wnon-trivial-memaccess]
>    memcpy (dst, src, sizeof (*src));
>                                   ^
> $
>
> Implementations of vector-like classes can very well (and are
> encouraged) to make use of std::is_trivially_copyable to know whether
> they can copy a range of elements to new storage
> using memcpy/memmove/mempcpy.
>
> Running your patch against GDB trips on such a case:
>
> src/gdb/btrace.h: In function ‘btrace_insn_s* VEC_btrace_insn_s_quick_insert(VEC_btrace_insn_s*, unsigned int, const btrace_insn_s*, const char*, unsigned int)’:
> src/gdb/common/vec.h:948:62: error: calling ‘void* memmove(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘btrace_insn_s {aka struct btrace_insn}’ [-Werror=non-trivial-memaccess]
>    memmove (slot_ + 1, slot_, (vec_->num++ - ix_) * sizeof (T));    \
>                                                               ^
>
> There is nothing wrong with the code being warned here.
> While "struct btrace_insn" is trivial (has a user-provided default
> ctor), it is still trivially copyable.


Any news on getting a "fix" for this issue.  Right now it blocks my
testing of GCC/gdb because I am building the trunk of both in a CI
loop and my build is broken due to this warning.  Should I just add
--disable-werror to my gdb build instead?

Thanks,
Andrew Pinski

>
> Now, this gdb code is using the old VEC (originated from
> gcc's C days, it's not the current C++fied VEC implementation),
> but the point is that any other random vector-like container out there
> is free to optimize copy of a range of non-trivial but trivially
> copyable types using memcpy/memmove.
>
> Note that libstdc++ does not actually do that optimization, but
> that's just a missed optimization, see PR libstdc++/68350 [1]
> "std::uninitialized_copy overly restrictive for
> trivially_copyable types".  (libstdc++'s std::vector defers
> copy to std::unitialized_copy.)
>
> [1] - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68350
>
>> These aren't undefined and the patch could be tweaked to allow
>> them.
>
> I think they're undefined because the types are not trivially
> copyable (non-trivial destructor with side effects).
>
>> I decided not to invest effort into it because, although
>> not strictly erroneous, I think they represent poor practice.
>
> I think that for my suggestion, all you really need is use
> call trivially_copyable_p instead of trivial_type_p, for
> memcpy/memmove/mempcpy at least.
>
> For memset, I'd suggest to go the other direction actually, and
> instead of relaxing the trivial check, to tighten it, by warning about
> memset'ting objects of non-standard-layout type too.  I.e., warn for
> memset of all non-POD (non-trivial + non-standard-layout) types.  That's
> what I did in the gdb patch.  That would also produce warnings for memset
> of trivial types that via refactoring end up with references, like:
>
> struct Trivial
> {
>   Trivial () = default;
>   Trivial (int &i) : m_i (i) {}
>   void add (int howmuch) { m_i += howmuch; }
>
> private:
>   int &m_i;
> };
>
> void reset (Trivial *triv)
> {
>   memset (triv, 0, sizeof (Trivial));
> }
>
> void recompute (Trivial *triv)
> {
>   reset (triv); // start over
>   triv->add (10); // whoops, null reference.
> }
>
> It's also warn for memset of trivial types that aren't standard
> layout due to having a mix of public/protected/private fields,
> which is likely not a real problem in practice, but I'd call
> those a code smell that warrants a warning too:
>
>  struct S
>  {
>  private:
>    int a;
>  public:
>    int b;
>  };
>
>  S s;
>  memset (&s, 0, sizeof (S));
>
>
> Playing with the patch, I noticed that you can't silence
> it by casting the pointer to void, but you can with
> casting to char, like:
>
> void copy (B *dst, B *src)
> {
>     memcpy (dst, src, sizeof (*src)); // warns
>     memcpy ((void*) dst, (void *) src, sizeof (*src)); // still warns
>     memcpy ((char*) dst, (char *) src, sizeof (*src)); // doesn't warn
>     memcpy ((unsigned char*) dst, (unsigned char *) src, sizeof (*src)); // doesn't warn
> }
>
> I can understand how we end up like that, given char's magic properties, but still
> I think many will reach for "void" first if they (really really) need to add a cast to
> silence the warning.  In any case, I think it'd be very nice to add cases with such
> casts to the new tests, and maybe mention it in the documentation too.
>
> Thanks,
> Pedro Alves
>
diff mbox

Patch

PR c++/80560 - warn on undefined memory operations involving non-trivial types

gcc/c-family/ChangeLog:

	PR c++/80560
	* c.opt (-Wnon-trivial-memaccess): New option.

gcc/cp/ChangeLog:

	PR c++/80560
	* call.c (maybe_warn_nontrivial_memaccess): New function.
	(build_cxx_call): Call it.

gcc/ChangeLog:

	PR c++/80560
	* doc/invoke.texi (Wnon-trivial-memaccess): Document new option.

libitm/ChangeLog:

	PR c++/80560
	* beginend.cc (GTM::gtm_thread::rollback): Avoid calling memset
	on an object of a non-trivial type.

libcpp/ChangeLog:

	PR c++/80560
	* line-map.c (line_maps::~line_maps): Avoid calling htab_delete
	with a null pointer.
	(linemap_init): Avoid calling memset on an object of a non-trivial
	type.

gcc/testsuite/ChangeLog:

	PR c++/80560
	* g++.dg/Wnon-trivial-memaccess.C: New test.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 9ad2f6e..dae5e80 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -792,6 +792,10 @@  Wnon-template-friend
 C++ ObjC++ Var(warn_nontemplate_friend) Init(1) Warning
 Warn when non-templatized friend functions are declared within a template.
 
+Wnon-trivial-memaccess
+C++ ObjC++ Var(warn_nontrival_memaccess) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn for raw memory writes to objects of non-trivial types.
+
 Wnon-virtual-dtor
 C++ ObjC++ Var(warn_nonvdtor) Warning LangEnabledBy(C++ ObjC++,Weffc++)
 Warn about non-virtual destructors.
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index c15b8e4..8655b53 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8152,6 +8152,43 @@  build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
   return call;
 }
 
+/* Issue a warning on a call to the built-in function FNDECL if it is
+   a memory write whose destination is an object of a non-trivial type.  */
+
+static void
+maybe_warn_nontrivial_memaccess (location_t loc, tree fndecl, tree dest)
+{
+  /* Warn about raw memory operations whose destination is an object
+     of a non-trivial type because they are undefined.  */
+  bool memfunc = false;
+  switch (DECL_FUNCTION_CODE (fndecl))
+    {
+    case BUILT_IN_BZERO:
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMSET:
+      memfunc = true;
+      break;
+
+    default:
+      break;
+    }
+
+  if (memfunc)
+    {
+      if (TREE_CODE (dest) == NOP_EXPR)
+	dest = TREE_OPERAND (dest, 0);
+
+      tree desttype = TREE_TYPE (TREE_TYPE (dest));
+
+      if (COMPLETE_TYPE_P (desttype) && !trivial_type_p (desttype))
+	warning_at (loc, OPT_Wnon_trivial_memaccess,
+		    "calling %qD with a pointer to a non-trivial type %#qT",
+		    fndecl, desttype);
+    }
+}
+
 /* Build and return a call to FN, using NARGS arguments in ARGARRAY.
    This function performs no overload resolution, conversion, or other
    high-level operations.  */
@@ -8184,6 +8221,9 @@  build_cxx_call (tree fn, int nargs, tree *argarray,
       if (!check_builtin_function_arguments (EXPR_LOCATION (fn), vNULL, fndecl,
 					     nargs, argarray))
 	return error_mark_node;
+
+      /* Warn if the built-in writes to an object of a non-trivial type.  */
+      maybe_warn_nontrivial_memaccess (loc, fndecl, argarray[0]);
     }
 
     /* If it is a built-in array notation function, then the return type of
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0eeea7b..e1d01a9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -215,7 +215,8 @@  in the following sections.
 -Wabi=@var{n}  -Wabi-tag  -Wconversion-null  -Wctor-dtor-privacy @gol
 -Wdelete-non-virtual-dtor  -Wliteral-suffix  -Wmultiple-inheritance @gol
 -Wnamespaces  -Wnarrowing @gol
--Wnoexcept  -Wnoexcept-type  -Wnon-virtual-dtor  -Wreorder  -Wregister @gol
+-Wnoexcept  -Wnoexcept-type  -Wnon-trivial-memaccess @gol
+-Wnon-virtual-dtor  -Wreorder  -Wregister @gol
 -Weffc++  -Wstrict-null-sentinel  -Wtemplates @gol
 -Wno-non-template-friend  -Wold-style-cast @gol
 -Woverloaded-virtual  -Wno-pmf-conversions @gol
@@ -2911,6 +2912,21 @@  void g() noexcept;
 void h() @{ f(g); @} // in C++14 calls f<void(*)()>, in C++1z calls f<void(*)()noexcept>
 @end smallexample
 
+@item -Wnon-trivial-memaccess @r{(C++ and Objective-C++ only)}
+@opindex Wnon-trivial-memaccess
+Warn when the destination of a call to a raw memory function such as
+@code{memset} or @code{memcpy} is an object of a non-trivial class type.
+Modifying the representation of such an object may violate invariants
+maintained by member functions of the class.
+For example, the call to @code{memset} below is undefined becase it
+modifies a non-trivial class object and is, therefore, diagnosed.
+The safe way to either initialize or "reset" objects of non-trivial
+types is by using the appropriate constructor.
+@smallexample
+std::string str = "abc";
+memset (&str, 0, 3);
+@end smallexample
+The @option{-Wnon-trivial-memaccess} option is enabled by @option{-Wall}.
 
 @item -Wnon-virtual-dtor @r{(C++ and Objective-C++ only)}
 @opindex Wnon-virtual-dtor
diff --git a/gcc/testsuite/g++.dg/Wnon-trivial-memaccess.C b/gcc/testsuite/g++.dg/Wnon-trivial-memaccess.C
new file mode 100644
index 0000000..812a9eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/Wnon-trivial-memaccess.C
@@ -0,0 +1,122 @@ 
+/* PR c++/80560 - warn on undefined memory operations involving non-trivial
+   types
+   { dg-do compile }
+   { dg-options "-Wnon-trivial-memaccess" } */
+
+struct Trivial { int i; char *s; char a[4]; };
+struct HasDefaultCtor { HasDefaultCtor (); };
+struct HasCopyCtor { HasCopyCtor (); };
+struct HasDtor { HasDtor (); };
+struct HasAssign { void operator= (HasAssign&); };
+
+typedef __SIZE_TYPE__ size_t;
+
+extern "C" {
+  void bzero (void*, size_t);
+  void* memcpy (void*, const void*, size_t);
+  void* memmove (void*, const void*, size_t);
+  void* mempcpy (void*, const void*, size_t);
+  void* memset (void*, int, size_t);
+}
+
+void sink (void*);
+
+#define T(fn, arglist) (fn arglist, sink (p))
+
+void test (Trivial *p, void *q)
+{
+  T (bzero, (p, sizeof *p));
+  T (bzero, (q, sizeof *p));
+
+  T (memcpy, (p, q, sizeof *p));
+  T (memcpy, (q, p, sizeof *p));
+
+  T (memset, (p, 0, sizeof *p));
+  T (memset, (q, 0, sizeof *p));
+
+  T (memmove, (p, q, sizeof *p));
+  T (memmove, (q, p, sizeof *p));
+}
+
+void test (HasDefaultCtor *p, const void *q)
+{
+  T (bzero, (p, sizeof *p));   // { dg-warning "calling .void bzero(\[^\n\r\]*). with a pointer to a non-trivial type 'struct HasDefaultCtor." }
+
+  T (memcpy, (p, q, sizeof *p));  // { dg-warning "calling .void\\* memcpy" }
+
+  T (memset, (p, 0, sizeof *p));  // { dg-warning "calling .void\\* memset" }
+
+  T (memmove, (p, q, sizeof *p)); // { dg-warning "calling .void\\* memmove" }
+
+  T (mempcpy, (p, q, sizeof *p)); // { dg-warning "calling .void\\* mempcpy" }
+}
+
+void test (HasCopyCtor *p, const void *q)
+{
+  T (bzero, (p, sizeof *p));      // { dg-warning "calling .void bzero" }
+
+  T (memcpy, (p, q, sizeof *p));  // { dg-warning "calling .void\\* memcpy" }
+
+  T (memset, (p, 0, sizeof *p));  // { dg-warning "calling .void\\* memset" }
+
+  T (memmove, (p, q, sizeof *p)); // { dg-warning "calling .void\\* memmove" }
+
+  T (mempcpy, (p, q, sizeof *p)); // { dg-warning "calling .void\\* mempcpy" }
+}
+
+void test (HasDtor *p, const void *q)
+{
+  T (bzero, (p, sizeof *p));      // { dg-warning "calling .void bzero" }
+
+  T (memcpy, (p, q, sizeof *p));  // { dg-warning "calling .void\\* memcpy" }
+
+  T (memset, (p, 0, sizeof *p));  // { dg-warning "calling .void\\* memset" }
+
+  T (memmove, (p, q, sizeof *p)); // { dg-warning "calling .void\\* memmove" }
+
+  T (mempcpy, (p, q, sizeof *p)); // { dg-warning "calling .void\\* mempcpy" }
+}
+
+void test (HasAssign *p, const void *q)
+{
+  T (bzero, (p, sizeof *p));     // { dg-warning "calling .void bzero" }
+
+  T (memcpy, (p, q, sizeof *p));  // { dg-warning "calling .void\\* memcpy" }
+
+  T (memset, (p, 0, sizeof *p));  // { dg-warning "calling .void\\* memset" }
+
+  T (memmove, (p, q, sizeof *p)); // { dg-warning "calling .void\\* memmove" }
+
+  T (mempcpy, (p, q, sizeof *p)); // { dg-warning "calling .void\\* mempcpy" }
+}
+
+
+void test_expr (int i)
+{
+  struct TestClass: HasDefaultCtor { };
+  TestClass a, b;
+
+  static void *p;
+
+  T (bzero, (i < 0 ? &a : &b, 1));  // { dg-warning "calling .void bzero" }
+}
+
+void test_this ()
+{
+#undef T
+#define T(fn, arglist) (fn arglist, sink (this))
+
+  static const void *p;
+
+  struct TestDefaultCtor: HasDefaultCtor
+  {
+    TestDefaultCtor ()
+    {
+      T (bzero, (this, sizeof *this)); // { dg-warning "calling .void bzero" }
+      T (memset, (this, 0, sizeof *this)); // { dg-warning "calling .void\\* memset" }
+      T (memcpy, (this, p, sizeof *this)); // { dg-warning "calling .void\\* memcpy" }
+      T (memmove, (this, p, sizeof *this)); // { dg-warning "calling .void\\* memmove" }
+      T (mempcpy, (this, p, sizeof *this)); // { dg-warning "calling .void\\* mempcpy" }
+    }
+  };
+}
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 949489e..4e36e38 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -62,7 +62,8 @@  extern unsigned num_macro_tokens_counter;
 
 line_maps::~line_maps ()
 {
-  htab_delete (location_adhoc_data_map.htab);
+  if (location_adhoc_data_map.htab)
+    htab_delete (location_adhoc_data_map.htab);
 }
 
 /* Hash function for location_adhoc_data hashtable.  */
@@ -347,7 +348,7 @@  void
 linemap_init (struct line_maps *set,
 	      source_location builtin_location)
 {
-  memset (set, 0, sizeof (struct line_maps));
+  *set = line_maps ();
   set->highest_location = RESERVED_LOCATION_COUNT - 1;
   set->highest_line = RESERVED_LOCATION_COUNT - 1;
   set->location_adhoc_data_map.htab =
diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index d04f3e9..c6550a3 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -431,7 +431,7 @@  GTM::gtm_transaction_cp::save(gtm_thread* tx)
   // Save everything that we might have to restore on restarts or aborts.
   jb = tx->jb;
   undolog_size = tx->undolog.size();
-  memcpy(&alloc_actions, &tx->alloc_actions, sizeof(alloc_actions));
+  alloc_actions = tx->alloc_actions;
   user_actions_size = tx->user_actions.size();
   id = tx->id;
   prop = tx->prop;
@@ -449,7 +449,7 @@  GTM::gtm_transaction_cp::commit(gtm_thread* tx)
   // commits of nested transactions. Allocation actions must be committed
   // before committing the snapshot.
   tx->jb = jb;
-  memcpy(&tx->alloc_actions, &alloc_actions, sizeof(alloc_actions));
+  tx->alloc_actions = alloc_actions;
   tx->id = id;
   tx->prop = prop;
 }
@@ -485,7 +485,7 @@  GTM::gtm_thread::rollback (gtm_transaction_cp *cp, bool aborting)
       prop = cp->prop;
       if (cp->disp != abi_disp())
 	set_abi_disp(cp->disp);
-      memcpy(&alloc_actions, &cp->alloc_actions, sizeof(alloc_actions));
+      alloc_actions = cp->alloc_actions;
       nesting = cp->nesting;
     }
   else