diff mbox

[REPOST] Invalid Code when reading from unaligned zero-sized array

Message ID 55501474.8ce4n3Il8S@polaris
State New
Headers show

Commit Message

Eric Botcazou Dec. 6, 2013, 9:11 a.m. UTC
> Here's the Correct Fix(tm).  We may or may not decide to go for it because
> of concerns about ABI changes; in the latter case, any kludge that we'll
> put in place instead must be restricted to the cases caught by this patch.
> 
> 
> 	* stor-layout.c (compute_record_mode): Return BLKmode for a trailing
>  	array with size 0 or 1.

Revised version without the one-by-one error...

Comments

Richard Biener Dec. 6, 2013, 10 a.m. UTC | #1
On Fri, Dec 6, 2013 at 10:11 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Here's the Correct Fix(tm).  We may or may not decide to go for it because
>> of concerns about ABI changes; in the latter case, any kludge that we'll
>> put in place instead must be restricted to the cases caught by this patch.
>>
>>
>>       * stor-layout.c (compute_record_mode): Return BLKmode for a trailing
>>       array with size 0 or 1.
>
> Revised version without the one-by-one error...

It's not fully fixing the issue as _all_ aggregates that may be
accessed beyond their
declarations size are broken.

I'd say we should simply stop giving aggregates a mode besides BLKmode.  What
can possibly break with that ...

struct { char c[4]; }

has SImode, we accept all trailing arrays as possibly extending beyond the
struct declaration.

Alternatively all structs with aggregate members should not have a
mode != BLKmode.

Previously you said it doesn't have ABI impacts and I doubt it has optimization
impacts.

Richard.

> --
> Eric Botcazou
Jeff Law Dec. 6, 2013, 9:25 p.m. UTC | #2
On 12/06/13 02:11, Eric Botcazou wrote:
>> Here's the Correct Fix(tm).  We may or may not decide to go for it because
>> of concerns about ABI changes; in the latter case, any kludge that we'll
>> put in place instead must be restricted to the cases caught by this patch.
>>
>>
>> 	* stor-layout.c (compute_record_mode): Return BLKmode for a trailing
>>   	array with size 0 or 1.
>
> Revised version without the one-by-one error...
I'd certainly be concerned.  Ports have (for better or worse) keyed on 
BLKmode rather than looking at the underlying types.  So if something 
which was previously SImode or DImode is now BLKmode, there's a nonzero 
chance we're going to change how it gets passed.

jeff
Eric Botcazou Dec. 7, 2013, 10:32 a.m. UTC | #3
> It's not fully fixing the issue as _all_ aggregates that may be
> accessed beyond their declarations size are broken.

Sure, but we don't need to support such nonsense in the general case.  And not 
every language allows it, for example in Ada you cannot do that of course.

> I'd say we should simply stop giving aggregates a mode besides BLKmode. 
> What can possibly break with that ...

Nothing, but this will unnecessarily pessimize well-behaved languages, e.g. in 
Ada we overalign structures in some cases to give them integral modes.

> struct { char c[4]; }
> 
> has SImode, we accept all trailing arrays as possibly extending beyond the
> struct declaration.
>
> Alternatively all structs with aggregate members should not have a
> mode != BLKmode.

This could work as well, although I'd restrict this to arrays, recursively.
Eric Botcazou Dec. 7, 2013, 10:44 a.m. UTC | #4
> I'd certainly be concerned.  Ports have (for better or worse) keyed on
> BLKmode rather than looking at the underlying types.  So if something
> which was previously SImode or DImode is now BLKmode, there's a nonzero
> chance we're going to change how it gets passed.

Well, we have been saying that calling conventions need to be keyed on types 
rather than modes for more than a decade...  I recall auditing and fixing the 
SPARC back-end circa 2003, so how long are we going to use this argument?

That being said, the concern is certainly valid so we may want to go for a 
kludge instead of the fix.  The point is that the kludge should do exactly 
what the fix would have done in the RTL expander and nothing more; it's out of 
question to pessimize all the other languages and all the other cases in the C 
family of languages for highly artificial testcases using non-portable code.
Eric Botcazou Dec. 7, 2013, 11:31 a.m. UTC | #5
> That being said, the concern is certainly valid so we may want to go for a
> kludge instead of the fix.  The point is that the kludge should do exactly
> what the fix would have done in the RTL expander and nothing more; it's out
> of question to pessimize all the other languages and all the other cases in
> the C family of languages for highly artificial testcases using
> non-portable code.

What about:
 1. a new boolean language hook support_trailing_arrays,
 2. a new flag on RECORD_OR_UNION_TYPE_P types,
 3. modifying stor-layout.c so that it sets the new flag on the appropriate 
types when support_trailing_arrays is true,
 4. modifying expr.c along the lines of Bernd's latest idea to avoid returning 
anything but a MEM for objects with a type on which the flag is set?
Richard Biener Dec. 8, 2013, 9:28 a.m. UTC | #6
Eric Botcazou <ebotcazou@adacore.com> wrote:
>> It's not fully fixing the issue as _all_ aggregates that may be
>> accessed beyond their declarations size are broken.
>
>Sure, but we don't need to support such nonsense in the general case. 
>And not 
>every language allows it, for example in Ada you cannot do that of
>course.

Well, we certainly need to support it as far as not ICEing

>> I'd say we should simply stop giving aggregates a mode besides
>BLKmode. 
>> What can possibly break with that ...
>
>Nothing, but this will unnecessarily pessimize well-behaved languages,
>e.g. in 
>Ada we overalign structures in some cases to give them integral modes.

What are the transformations that are enabled by making something not BLKmode?

On the gimple level I cannot think of one..

>> struct { char c[4]; }
>> 
>> has SImode, we accept all trailing arrays as possibly extending
>beyond the
>> struct declaration.
>>
>> Alternatively all structs with aggregate members should not have a
>> mode != BLKmode.
>
>This could work as well, although I'd restrict this to arrays,
>recursively.

Works for me.

Thanks,
Richard.
Richard Biener Dec. 8, 2013, 9:32 a.m. UTC | #7
Eric Botcazou <ebotcazou@adacore.com> wrote:
>> That being said, the concern is certainly valid so we may want to go
>for a
>> kludge instead of the fix.  The point is that the kludge should do
>exactly
>> what the fix would have done in the RTL expander and nothing more;
>it's out
>> of question to pessimize all the other languages and all the other
>cases in
>> the C family of languages for highly artificial testcases using
>> non-portable code.
>
>What about:
> 1. a new boolean language hook support_trailing_arrays,
> 2. a new flag on RECORD_OR_UNION_TYPE_P types,
>3. modifying stor-layout.c so that it sets the new flag on the
>appropriate 
>types when support_trailing_arrays is true,
>4. modifying expr.c along the lines of Bernd's latest idea to avoid
>returning 
>anything but a MEM for objects with a type on which the flag is set?

Sounds more complicated than the other two options. Fix the types mode or add the expand flag.

I'm ok with either variant and I'm not worried about the ABI thing too much for the important archs.

Richard.
Jeff Law Dec. 9, 2013, 4:03 a.m. UTC | #8
On 12/07/13 03:44, Eric Botcazou wrote:
>> I'd certainly be concerned.  Ports have (for better or worse) keyed on
>> BLKmode rather than looking at the underlying types.  So if something
>> which was previously SImode or DImode is now BLKmode, there's a nonzero
>> chance we're going to change how it gets passed.
>
> Well, we have been saying that calling conventions need to be keyed on types
> rather than modes for more than a decade...  I recall auditing and fixing the
> SPARC back-end circa 2003, so how long are we going to use this argument?
I don't recall such a change in policy -- but that doesn't mean it 
didn't happen :-)

If we already declared that ports should be looking at the underlying 
types rather than the mode and it's been in place that long, then I 
think any port that hasn't been audited/updated deserves its fate and we 
shouldn't let them stand in the way of making progress on this issue.

jeff
Bernd Edlinger Dec. 9, 2013, 9:07 a.m. UTC | #9
> On 12/07/13 03:44, Eric Botcazou wrote:
>>> I'd certainly be concerned. Ports have (for better or worse) keyed on
>>> BLKmode rather than looking at the underlying types. So if something
>>> which was previously SImode or DImode is now BLKmode, there's a nonzero
>>> chance we're going to change how it gets passed.
>>
>> Well, we have been saying that calling conventions need to be keyed on types
>> rather than modes for more than a decade... I recall auditing and fixing the
>> SPARC back-end circa 2003, so how long are we going to use this argument?
> I don't recall such a change in policy -- but that doesn't mean it
> didn't happen :-)
>
> If we already declared that ports should be looking at the underlying
> types rather than the mode and it's been in place that long, then I
> think any port that hasn't been audited/updated deserves its fate and we
> shouldn't let them stand in the way of making progress on this issue.
>
> jeff

The ports are one thing, but this non-BLKmode thing can also be visible to
the user code:

If we have

struct s{ char x[8]; };

We can place that in a register like this:

register struct s x __asm__("eax");

Previous experimentation showed that this code can break with that ABI change.

So if we change that structure mode, we should make sure that this has no other
unexpected implications, especially in phase 3.

Bernd.
Richard Biener Dec. 9, 2013, 9:57 a.m. UTC | #10
On Mon, Dec 9, 2013 at 10:07 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
>> On 12/07/13 03:44, Eric Botcazou wrote:
>>>> I'd certainly be concerned. Ports have (for better or worse) keyed on
>>>> BLKmode rather than looking at the underlying types. So if something
>>>> which was previously SImode or DImode is now BLKmode, there's a nonzero
>>>> chance we're going to change how it gets passed.
>>>
>>> Well, we have been saying that calling conventions need to be keyed on types
>>> rather than modes for more than a decade... I recall auditing and fixing the
>>> SPARC back-end circa 2003, so how long are we going to use this argument?
>> I don't recall such a change in policy -- but that doesn't mean it
>> didn't happen :-)
>>
>> If we already declared that ports should be looking at the underlying
>> types rather than the mode and it's been in place that long, then I
>> think any port that hasn't been audited/updated deserves its fate and we
>> shouldn't let them stand in the way of making progress on this issue.
>>
>> jeff
>
> The ports are one thing, but this non-BLKmode thing can also be visible to
> the user code:
>
> If we have
>
> struct s{ char x[8]; };
>
> We can place that in a register like this:
>
> register struct s x __asm__("eax");
>
> Previous experimentation showed that this code can break with that ABI change.
>
> So if we change that structure mode, we should make sure that this has no other
> unexpected implications, especially in phase 3.

We supposedly can force the mode of the DECL to be that of EAX
at the declaration, even if the mode of the type is BLKmode.

Richard.

> Bernd.
Eric Botcazou Dec. 10, 2013, 10:04 a.m. UTC | #11
> What are the transformations that are enabled by making something not
> BLKmode?
> 
> On the gimple level I cannot think of one..

On the RTL level, it's simple: anything BLKmode is forced to memory instead of 
being loaded into registers.

> >This could work as well, although I'd restrict this to arrays,
> >recursively.
>
> Works for me.

Are we sure that we really support out-of-bounds accesses for arbitrary arrays 
though?  It seems to me that we easily take advantage of them e.g. in loops to 
invoke undefined behavior.  IMO it's not clear whether we want to risk more 
accidental ABI changes if they are not supported throughout the compiler.
Jakub Jelinek Dec. 10, 2013, 10:22 a.m. UTC | #12
On Tue, Dec 10, 2013 at 11:04:53AM +0100, Eric Botcazou wrote:
> > What are the transformations that are enabled by making something not
> > BLKmode?
> > 
> > On the gimple level I cannot think of one..
> 
> On the RTL level, it's simple: anything BLKmode is forced to memory instead of 
> being loaded into registers.
> 
> > >This could work as well, although I'd restrict this to arrays,
> > >recursively.
> >
> > Works for me.
> 
> Are we sure that we really support out-of-bounds accesses for arbitrary arrays 
> though?  It seems to me that we easily take advantage of them e.g. in loops to 
> invoke undefined behavior.  IMO it's not clear whether we want to risk more 
> accidental ABI changes if they are not supported throughout the compiler.

I think we don't support out-of-bounds accesses for global vars (ok, there
are vars with real flexible array members as GNU extension initialized by
initializers that initialize the flexible array members, but then the decl
size etc. should be adjusted for that), similarly for automatic vars (and in
both cases that includes the hard register vars).  What we support is
out of bounds accesses for heap vars if the var's type has flexible array member
or something we treat similarly and there is the possibility that there
could be payload after the heap var that could be accessed from the flexible
array members or similar arrays.  And of course if you have just pointer to
some var and can't see what it points to, we need to treat it as if it could
be heap var.

So, I don't see what is the big deal with BLKmode, because all the cases
which actually could have flexible array member extra payloads (or similar)
must necessarily live in memory, if it is the compiler that decides whether
to put it into memory or keep in registers etc., then it can't be heap
allocated.

	Jakub
Eric Botcazou Dec. 10, 2013, 10:53 a.m. UTC | #13
> What we support is out of bounds accesses for heap vars if the var's type
> has flexible array member or something we treat similarly and there is the
> possibility that there could be payload after the heap var that could be 
> accessed from the flexible array members or similar arrays.

My question was about the above similar arrays, i.e. whether we consider all 
trailing arrays in structures as flexible-like or not.  No strong opinion.

> So, I don't see what is the big deal with BLKmode, because all the cases
> which actually could have flexible array member extra payloads (or similar)
> must necessarily live in memory, if it is the compiler that decides whether
> to put it into memory or keep in registers etc., then it can't be heap
> allocated.

The invariant is that types for which objects can effectively have variable 
size must have BLKmode, otherwise you need to add very ugly code in the RTL 
expander to mask the lie.
Richard Biener Dec. 10, 2013, 3:02 p.m. UTC | #14
On Tue, Dec 10, 2013 at 11:53 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> What we support is out of bounds accesses for heap vars if the var's type
>> has flexible array member or something we treat similarly and there is the
>> possibility that there could be payload after the heap var that could be
>> accessed from the flexible array members or similar arrays.
>
> My question was about the above similar arrays, i.e. whether we consider all
> trailing arrays in structures as flexible-like or not.  No strong opinion.

Yes we do, even for struct { struct { int a; char a[1] } }; (note the not really
"trailing" as there is padding after the trailing array).  We do take
size limitations from a DECL (if we see one) into account to limit the
effect of this trailing-array-supporting, so it effectively only applies to
indirect accesses (and the padding example above, you can use the whole
padding if DECL_SIZE allows that).

>> So, I don't see what is the big deal with BLKmode, because all the cases
>> which actually could have flexible array member extra payloads (or similar)
>> must necessarily live in memory, if it is the compiler that decides whether
>> to put it into memory or keep in registers etc., then it can't be heap
>> allocated.
>
> The invariant is that types for which objects can effectively have variable
> size must have BLKmode, otherwise you need to add very ugly code in the RTL
> expander to mask the lie.

I wonder if we can make the expander more rely on the DECLs mode
and optimize only the DECLs mode (if we can constrain its size via
DECL_SIZE) to non-BLKmode instead of doing that for the TYPEs mode.
Yes, you'd have DECL_MODE != TYPE_MODE that way.

Or rather I wonder if the expander doesn't already work that way
(looks at DECL_MODE).

Richard.

> --
> Eric Botcazou
diff mbox

Patch

Index: stor-layout.c
===================================================================
--- stor-layout.c	(revision 205727)
+++ stor-layout.c	(working copy)
@@ -1605,6 +1605,17 @@  compute_record_mode (tree type)
 	  || ! tree_fits_uhwi_p (DECL_SIZE (field)))
 	return;
 
+      /* As a GNU extension, we support out-of-bounds accesses for a trailing
+	 array with size 0 or 1.  In this case, the record type effectively
+	 has variable size so it needs to have BLKmode.  */
+      if (!DECL_CHAIN (field) && TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE)
+	{
+	  tree domain_type = TYPE_DOMAIN (TREE_TYPE (field));
+	  if (!TYPE_MAX_VALUE (domain_type)
+	      || integer_zerop (TYPE_MAX_VALUE (domain_type)))
+	    return;
+	}
+
       /* If this field is the whole struct, remember its mode so
 	 that, say, we can put a double in a class into a DF
 	 register instead of forcing it to live in the stack.  */