Message ID | 503CE088.70208@salomon.at |
---|---|
State | New |
Headers | show |
On 28 August 2012 16:15, Michael Haubenwallner wrote: > Hi, > > in some old, large, originally C-written application (using gcc-4.2.4 still) > I did have to find a bug that boils down to something like this: > > std::string x; > strcpy( (char*) x.c_str(), "abc"); > > Any subsequent empty std::string instance did contain "abc" instead of "", > which was the issue showing up to the user. > > My idea what could have helped out here was to make the empty string _Rep > object readonly (ie. const), to get a segmentation fault along the strcpy. Does it actually produce a segfault? I suppose it might on some platforms, but not all, so I'm not sure it's worth changing. (It seems easier to simply track down and shoot the programmer who thought it was OK to cast away const on the string! ;-)
On 08/28/2012 06:46 PM, Jonathan Wakely wrote: > On 28 August 2012 16:15, Michael Haubenwallner wrote: >> Hi, >> >> in some old, large, originally C-written application (using gcc-4.2.4 still) >> I did have to find a bug that boils down to something like this: >> >> std::string x; >> strcpy( (char*) x.c_str(), "abc"); >> >> Any subsequent empty std::string instance did contain "abc" instead of "", >> which was the issue showing up to the user. >> >> My idea what could have helped out here was to make the empty string _Rep >> object readonly (ie. const), to get a segmentation fault along the strcpy. > > Does it actually produce a segfault? I suppose it might on some > platforms, but not all, so I'm not sure it's worth changing. It does segfault here on (32bit each): i686-pc-linux-gnu ia64-hp-hpux11.31 i386-pc-solaris2.10 sparc-sun-solaris2.10 powerpc-ibm-aix5.3.0.0 powerpc-ibm-aix6.1.0.0 powerpc-ibm-aix7.1.0.0 It does not segfault here on: hppa2.0n-hp-hpux11.31 i586-pc-interix5.2 i586-pc-winnt5.2 (using MSVC) Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some linker flag, but that's a deprecated platform anyway. As long as the major development platform (Linux) does segfault, it feels worth changing - especially as string.clear() to write the '\0' back again won't help as quick'n dirty workaround since gcc-4.4.4 any more. Also, a segfault is always better than random behaviour, even in production. > (It seems easier to simply track down and shoot the programmer who > thought it was OK to cast away const on the string! ;-) I'm in doubt there isn't just one occurrence/developer with this bug... Thank you! /haubi/
On 28 August 2012 18:27, Michael Haubenwallner wrote: >> >> Does it actually produce a segfault? I suppose it might on some >> platforms, but not all, so I'm not sure it's worth changing. > > It does segfault here on (32bit each): > i686-pc-linux-gnu > ia64-hp-hpux11.31 > i386-pc-solaris2.10 > sparc-sun-solaris2.10 > powerpc-ibm-aix5.3.0.0 > powerpc-ibm-aix6.1.0.0 > powerpc-ibm-aix7.1.0.0 > > It does not segfault here on: > hppa2.0n-hp-hpux11.31 > i586-pc-interix5.2 > i586-pc-winnt5.2 (using MSVC) > > Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some linker flag, > but that's a deprecated platform anyway. > > As long as the major development platform (Linux) does segfault, it feels worth > changing - especially as string.clear() to write the '\0' back again won't help > as quick'n dirty workaround since gcc-4.4.4 any more. Hmm, I tested it on x86_64-unknown-linux-gnu without getting a segfault - but I might have messed up my test.
On 08/28/2012 08:12 PM, Jonathan Wakely wrote: > On 28 August 2012 18:27, Michael Haubenwallner wrote: >>> >>> Does it actually produce a segfault? I suppose it might on some >>> platforms, but not all, so I'm not sure it's worth changing. >> >> It does segfault here on (32bit each): >> i686-pc-linux-gnu >> ia64-hp-hpux11.31 >> i386-pc-solaris2.10 >> sparc-sun-solaris2.10 >> powerpc-ibm-aix5.3.0.0 >> powerpc-ibm-aix6.1.0.0 >> powerpc-ibm-aix7.1.0.0 >> >> It does not segfault here on: >> hppa2.0n-hp-hpux11.31 >> i586-pc-interix5.2 >> i586-pc-winnt5.2 (using MSVC) >> >> Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some linker flag, >> but that's a deprecated platform anyway. >> >> As long as the major development platform (Linux) does segfault, it feels worth >> changing - especially as string.clear() to write the '\0' back again won't help >> as quick'n dirty workaround since gcc-4.4.4 any more. > > Hmm, I tested it on x86_64-unknown-linux-gnu without getting a > segfault - but I might have messed up my test. Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does segfault as expected - when I make sure the correct libstdc++ is used at runtime, having the '_S_empty_rep_storage' symbol in the .rodata section rather than .bss. /haubi/
On 29 August 2012 13:25, Michael Haubenwallner wrote: > > On 08/28/2012 08:12 PM, Jonathan Wakely wrote: >> On 28 August 2012 18:27, Michael Haubenwallner wrote: >>>> >>>> Does it actually produce a segfault? I suppose it might on some >>>> platforms, but not all, so I'm not sure it's worth changing. >>> >>> It does segfault here on (32bit each): >>> i686-pc-linux-gnu >>> ia64-hp-hpux11.31 >>> i386-pc-solaris2.10 >>> sparc-sun-solaris2.10 >>> powerpc-ibm-aix5.3.0.0 >>> powerpc-ibm-aix6.1.0.0 >>> powerpc-ibm-aix7.1.0.0 >>> >>> It does not segfault here on: >>> hppa2.0n-hp-hpux11.31 >>> i586-pc-interix5.2 >>> i586-pc-winnt5.2 (using MSVC) >>> >>> Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some linker flag, >>> but that's a deprecated platform anyway. >>> >>> As long as the major development platform (Linux) does segfault, it feels worth >>> changing - especially as string.clear() to write the '\0' back again won't help >>> as quick'n dirty workaround since gcc-4.4.4 any more. >> >> Hmm, I tested it on x86_64-unknown-linux-gnu without getting a >> segfault - but I might have messed up my test. > > Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does segfault > as expected - when I make sure the correct libstdc++ is used at runtime, > having the '_S_empty_rep_storage' symbol in the .rodata section rather than .bss. Bah, I did mess up my test, not correctly disabling the extern template instantiations in the library. If it works reliably on x86_64 then I think the patch is worth considering. I'm on holiday for a week, so maybe one of the other maintainers will deal with it first.
On 08/30/2012 11:45 AM, Jonathan Wakely wrote: > On 29 August 2012 13:25, Michael Haubenwallner wrote: >> On 08/28/2012 08:12 PM, Jonathan Wakely wrote: >>> On 28 August 2012 18:27, Michael Haubenwallner wrote: >>>>> >>>>> Does it actually produce a segfault? I suppose it might on some >>>>> platforms, but not all, so I'm not sure it's worth changing. >> >> Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does segfault >> as expected - when I make sure the correct libstdc++ is used at runtime, >> having the '_S_empty_rep_storage' symbol in the .rodata section rather than .bss. > > If it works reliably on x86_64 then I think the patch is worth considering. > > I'm on holiday for a week, so maybe one of the other maintainers will > deal with it first. Any chance to get this in for 4.8? Thank you! /haubi/
On 30 October 2012 09:05, Michael Haubenwallner wrote:
> Any chance to get this in for 4.8?
I'm looking into it today.
On 30 October 2012 09:28, Jonathan Wakely wrote: > On 30 October 2012 09:05, Michael Haubenwallner wrote: >> Any chance to get this in for 4.8? > > I'm looking into it today. Consider the case where one object file containing std::string().erase() is built with an older GCC without the fix for PR 40518, then it's linked to a new libstdc++.so where the empty rep is read-only. The program will attempt to write to the empty rep, but now it's read-only and will crash. I don't think we can apply it unless we change the library ABI so that no pre-PR40518 objects can link to a libstdc++.so containing a read-only empty rep.
On 30 October 2012 10:11, Jonathan Wakely wrote: > On 30 October 2012 09:28, Jonathan Wakely wrote: >> On 30 October 2012 09:05, Michael Haubenwallner wrote: >>> Any chance to get this in for 4.8? >> >> I'm looking into it today. > > Consider the case where one object file containing > std::string().erase() is built with an older GCC without the fix for > PR 40518, then it's linked to a new libstdc++.so where the empty rep > is read-only. The program will attempt to write to the empty rep, but > now it's read-only and will crash. I don't think we can apply it > unless we change the library ABI so that no pre-PR40518 objects can > link to a libstdc++.so containing a read-only empty rep. If I can convince myself this can't happen then I'll commit it, but I need to look into it further this evening.
From d16c5d0cabbe9b308f852534be0a209b5b2c5cfd Mon Sep 17 00:00:00 2001 From: Michael Haubenwallner <michael.haubenwallner@salomon.at> Date: Tue, 28 Aug 2012 17:05:48 +0200 Subject: [PATCH] Make default string storage readonly. To help finding bugs that write to the empty string storage using C string manipulation functions like strcpy, define the empty string storage as const, resulting in a segmentation fault with such strcpy, which is ways better than having subsequent empty strings non-empty. --- libstdc++-v3/ChangeLog | 6 ++++++ libstdc++-v3/include/bits/basic_string.h | 4 ++-- libstdc++-v3/include/bits/basic_string.tcc | 4 ++-- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index c2fcddc..86a5781 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,9 @@ +2012-08-28 Michael Haubenwallner <michael.haubenwallner@salomon.at> + + (_S_empty_rep_storage): Make default string storage readonly. + * include/bits/basic_string.h: Declare as const. + * include/bits/basic_string.tcc: Define the const value. + 2012-08-27 Ulrich Drepper <drepper@gmail.com> Add interfaces to retrieve random numbers in bulk. diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h index 24562c4..2d9b742 100644 --- a/libstdc++-v3/include/bits/basic_string.h +++ b/libstdc++-v3/include/bits/basic_string.h @@ -177,7 +177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // The following storage is init'd to 0 by the linker, resulting // (carefully) in an empty string with one reference. - static size_type _S_empty_rep_storage[]; + static size_type const _S_empty_rep_storage[]; static _Rep& _S_empty_rep() @@ -185,7 +185,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // NB: Mild hack to avoid strict-aliasing warnings. Note that // _S_empty_rep_storage is never modified and the punning should // be reasonably safe in this case. - void* __p = reinterpret_cast<void*>(&_S_empty_rep_storage); + void* __p = const_cast<void*>(reinterpret_cast<void const*>(&_S_empty_rep_storage)); return *reinterpret_cast<_Rep*>(__p); } diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc index 7eff818..945ce1c 100644 --- a/libstdc++-v3/include/bits/basic_string.tcc +++ b/libstdc++-v3/include/bits/basic_string.tcc @@ -64,10 +64,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Linker sets _S_empty_rep_storage to all 0s (one reference, empty string) // at static init time (before static ctors are run). template<typename _CharT, typename _Traits, typename _Alloc> - typename basic_string<_CharT, _Traits, _Alloc>::size_type + typename basic_string<_CharT, _Traits, _Alloc>::size_type const basic_string<_CharT, _Traits, _Alloc>::_Rep::_S_empty_rep_storage[ (sizeof(_Rep_base) + sizeof(_CharT) + sizeof(size_type) - 1) / - sizeof(size_type)]; + sizeof(size_type)] = {}; // NB: This is the special case for Input Iterators, used in // istreambuf_iterators, etc. -- 1.7.3.4