diff mbox series

tolerate padding in mbstate_t

Message ID oriml4fhtf.fsf@livre.home
State New
Headers show
Series tolerate padding in mbstate_t | expand

Commit Message

Alexandre Oliva Jan. 22, 2020, 12:36 a.m. UTC
Padding in mbstate_t objects may get the memcmp to fail.
Attempt to avoid the failure with zero initialization.


Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
to fail because of padding in std::mbstate_t.  Ok to install?


for  libstdc++-v3/ChangeLog

	* testsuite/27_io/fpos/mbstate_t/1.cc: Zero-init mbstate_t.
---
 testsuite/27_io/fpos/mbstate_t/1.cc |   28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

Comments

Jonathan Wakely Jan. 22, 2020, 12:57 a.m. UTC | #1
On 21/01/20 21:36 -0300, Alexandre Oliva wrote:
>
>Padding in mbstate_t objects may get the memcmp to fail.
>Attempt to avoid the failure with zero initialization.
>
>
>Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
>to fail because of padding in std::mbstate_t.  Ok to install?

Under what conditions does this fail? Only for -std=gnu++98 and not
later standards, I assume?

Because since C++11 state_type() does perform zero-initialization of
the whole object (including padding) even if it has a default
constructor.


>
>for  libstdc++-v3/ChangeLog
>
>	* testsuite/27_io/fpos/mbstate_t/1.cc: Zero-init mbstate_t.
>---
> testsuite/27_io/fpos/mbstate_t/1.cc |   28 ++++++++++++++++++++++------
> 1 file changed, 22 insertions(+), 6 deletions(-)
>
>diff --git libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
>index f92d68f..559bd8d 100644
>--- libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
>+++ libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
>@@ -28,8 +28,24 @@
> void test01()
> {
>   typedef std::mbstate_t state_type;
>-  state_type state01 = state_type();
>-  state_type state02 = state_type();
>+  // Use zero-initialization of the underlying memory so that padding
>+  // bytes, if any, stand a better chance of comparing the same.
>+  // Zero-initialized memory is guaranteed to be a valid initial
>+  // state.  This doesn't quite guarantee that any padding bits won't
>+  // be overwritten when copying from other instances that haven't
>+  // been fully initialized: this data type is compatible with C, so
>+  // it is likely plain old data, but it could have a default ctor
>+  // that initializes only the relevant fields, whereas copy-ctor and
>+  // operator= could be implemented as a full-object memcpy, including
>+  // padding bits, rather than fieldwise copying.  However, since
>+  // we're comparing two values copied from the same state_type
>+  // instance (or was this meant to take one of them from pos02 rather
>+  // than both from pos01?),

I don't think so, that wouldn't work. I think pos02 could just be
removed from the test.
Alexandre Oliva Jan. 22, 2020, 2:04 a.m. UTC | #2
On Jan 21, 2020, Jonathan Wakely <jwakely@redhat.com> wrote:

> On 21/01/20 21:36 -0300, Alexandre Oliva wrote:
>> 
>> Padding in mbstate_t objects may get the memcmp to fail.
>> Attempt to avoid the failure with zero initialization.
>> 
>> 
>> Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
>> to fail because of padding in std::mbstate_t.  Ok to install?

> Under what conditions does this fail? Only for -std=gnu++98 and not
> later standards, I assume?

> Because since C++11 state_type() does perform zero-initialization of
> the whole object (including padding) even if it has a default
> constructor.

Err, IIUC it does so only for defaulted ctors; a user-supplied default
ctor (as in the target system) may leave padding bits uninitialized.
Indeed, compiling the following snippet without optimization, with or
without -DCTOR, on x86_64, uses movw to initialize bar and the rest of
the word remains uninitialized with -DCTOR, even with -std=c++17, as we
do without -DCTOR with -std=c++98, whereas without -DCTOR and
-std=c++1[17] we use movq.

struct t {
  long foo;
  short bar;
#if CTOR
  t() : foo(0), bar(0) {}
#endif
};
t f() {
  t v = t();
  return v;
}

When optimiizing, we end up using movq in all cases, but that's probably
because of SRA.


> I don't think so, that wouldn't work. I think pos02 could just be
> removed from the test.

Will do.
Jonathan Wakely Jan. 22, 2020, 10:29 a.m. UTC | #3
On 21/01/20 23:04 -0300, Alexandre Oliva wrote:
>On Jan 21, 2020, Jonathan Wakely <jwakely@redhat.com> wrote:
>
>> On 21/01/20 21:36 -0300, Alexandre Oliva wrote:
>>>
>>> Padding in mbstate_t objects may get the memcmp to fail.
>>> Attempt to avoid the failure with zero initialization.
>>>
>>>
>>> Regstrapped on x86_64-linux-gnu, and also tested on a platform that used
>>> to fail because of padding in std::mbstate_t.  Ok to install?
>
>> Under what conditions does this fail? Only for -std=gnu++98 and not
>> later standards, I assume?
>
>> Because since C++11 state_type() does perform zero-initialization of
>> the whole object (including padding) even if it has a default
>> constructor.
>
>Err, IIUC it does so only for defaulted ctors; a user-supplied default
>ctor (as in the target system) may leave padding bits uninitialized.

Yes, I misremembered.

>Indeed, compiling the following snippet without optimization, with or
>without -DCTOR, on x86_64, uses movw to initialize bar and the rest of
>the word remains uninitialized with -DCTOR, even with -std=c++17, as we
>do without -DCTOR with -std=c++98, whereas without -DCTOR and
>-std=c++1[17] we use movq.
>
>struct t {
>  long foo;
>  short bar;
>#if CTOR
>  t() : foo(0), bar(0) {}
>#endif
>};
>t f() {
>  t v = t();
>  return v;
>}

Right. I was confusing it with this case:

struct t2 : t { };

t2 g() {
   t2 v = t2();
   return v;
}

This one *does* zero-initialize the padding in struct t, even though
it has a user-provided default constructor.

>When optimiizing, we end up using movq in all cases, but that's probably
>because of SRA.
>
>
>> I don't think so, that wouldn't work. I think pos02 could just be
>> removed from the test.
>
>Will do.

Thanks, OK to commit.
Alexandre Oliva Jan. 23, 2020, 7:40 p.m. UTC | #4
On Jan 22, 2020, Jonathan Wakely <jwakely@redhat.com> wrote:

>>> I don't think so, that wouldn't work. I think pos02 could just be
>>> removed from the test.

>> Will do.

> Thanks, OK to commit.

Thanks, here's what I tested and am about to install.


tolerate padding in mbstate_t

From: Alexandre Oliva <oliva@adacore.com>

Padding in mbstate_t objects may get the memcmp to fail.
Attempt to avoid the failure with zero initialization.


for  libstdc++-v3/ChangeLog

	* testsuite/27_io/fpos/mbstate_t/1.cc: Zero-init mbstate_t.
---
 libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc |   28 +++++++++++++++++-----
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc b/libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
index f92d68f..28fec8e 100644
--- a/libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
+++ b/libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
@@ -28,24 +28,38 @@
 void test01()
 {
   typedef std::mbstate_t state_type;
-  state_type state01 = state_type();
-  state_type state02 = state_type();
+  // Use zero-initialization of the underlying memory so that padding
+  // bytes, if any, stand a better chance of comparing the same.
+  // Zero-initialized memory is guaranteed to be a valid initial
+  // state.  This doesn't quite guarantee that any padding bits won't
+  // be overwritten when copying from other instances that haven't
+  // been fully initialized: this data type is compatible with C, so
+  // it is likely plain old data, but it could have a default ctor
+  // that initializes only the relevant fields, whereas copy-ctor and
+  // operator= could be implemented as a full-object memcpy, including
+  // padding bits, rather than fieldwise copying.  However, since
+  // we're comparing two values copied from the same state_type
+  // instance, if padding bits are copied, we'll get the same for both
+  // of them, and if they aren't, we'll keep the values we initialized
+  // them with, so this should be good.
+  state_type state[2];
+  std::memset(state, 0, sizeof (state));
+
 
   std::streampos pos01(0);
-  std::streampos pos02(0);
 
   // 27.4.3.1 fpos members
   // void state(state_type s);
   // state_type state();
 
   // XXX Need to have better sanity checking for the mbstate_t type,
-  // or whatever the insantiating type for class fpos happens to be
+  // or whatever the instantiating type for class fpos happens to be
   // for streampos, as things like equality operators and assignment
   // operators, increment and deincrement operators need to be in
   // place.
-  pos01.state(state02);
-  state01 = pos01.state();
-  VERIFY( std::memcmp(&state01, &state02, sizeof(state_type)) == 0 );
+  pos01.state(state[1]);
+  state[0] = pos01.state();
+  VERIFY( std::memcmp(&state[0], &state[1], sizeof(state_type)) == 0 );
 }
 
 int main()
diff mbox series

Patch

diff --git libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
index f92d68f..559bd8d 100644
--- libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
+++ libstdc++-v3/testsuite/27_io/fpos/mbstate_t/1.cc
@@ -28,8 +28,24 @@ 
 void test01()
 {
   typedef std::mbstate_t state_type;
-  state_type state01 = state_type();
-  state_type state02 = state_type();
+  // Use zero-initialization of the underlying memory so that padding
+  // bytes, if any, stand a better chance of comparing the same.
+  // Zero-initialized memory is guaranteed to be a valid initial
+  // state.  This doesn't quite guarantee that any padding bits won't
+  // be overwritten when copying from other instances that haven't
+  // been fully initialized: this data type is compatible with C, so
+  // it is likely plain old data, but it could have a default ctor
+  // that initializes only the relevant fields, whereas copy-ctor and
+  // operator= could be implemented as a full-object memcpy, including
+  // padding bits, rather than fieldwise copying.  However, since
+  // we're comparing two values copied from the same state_type
+  // instance (or was this meant to take one of them from pos02 rather
+  // than both from pos01?), if padding bits are copied, we'll get the
+  // same for both of them, and if they aren't, we'll keep the values
+  // we initialized them with, so this should be good.
+  state_type state[2];
+  std::memset(state, 0, sizeof (state));
+
 
   std::streampos pos01(0);
   std::streampos pos02(0);
@@ -39,13 +55,13 @@  void test01()
   // state_type state();
 
   // XXX Need to have better sanity checking for the mbstate_t type,
-  // or whatever the insantiating type for class fpos happens to be
+  // or whatever the instantiating type for class fpos happens to be
   // for streampos, as things like equality operators and assignment
   // operators, increment and deincrement operators need to be in
   // place.
-  pos01.state(state02);
-  state01 = pos01.state();
-  VERIFY( std::memcmp(&state01, &state02, sizeof(state_type)) == 0 );
+  pos01.state(state[1]);
+  state[0] = pos01.state();
+  VERIFY( std::memcmp(&state[0], &state[1], sizeof(state_type)) == 0 );
 }
 
 int main()