bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679)

Message ID f80f84af-4f40-5af6-1b5f-88feafb589cc@gmail.com
State New

Commit Message

Martin Sebor Sept. 6, 2019, 7:27 p.m. UTC
Recent enhancements to -Wstringop-overflow improved the warning
to the point that it detects a superset of the problems
-Warray-bounds is intended to detect in character accesses.  Because
both warnings detect overlapping sets of problems, and because the
IL they work with tends to change in subtle ways from target to
target, tests designed to verify one or the other sometimes fail
on a target where the warning isn't robust enough to detect
the problem given the IL representation.

To reduce these test suite failures the attached patch extends
-Warray-bounds to handle some of the same problems
-Wstringop-overflow does, specifically, out-of-bounds accesses to
array members of structs, including zero-length arrays and flexible
array members of defined objects.
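
For illustration only (this is not taken from the patch or its tests,
and all names are made up), the three kinds of member arrays mentioned
above look like this.  The out-of-bounds access is left commented out
so the snippet stays well-defined:

```c
#include <stddef.h>

struct A { int n; char a[4]; };   /* ordinary member array */
struct Z { int n; char a[0]; };   /* zero-length array (GNU extension) */
struct F { int n; char a[]; };    /* C99 flexible array member */

static struct A ga;

/* In bounds: the valid indices for ga.a are 0 through 3.  */
char
last_valid (void)
{
  return ga.a[3];
}

/* Out of bounds, the kind of access the warnings are about:
     char past_end (void) { return ga.a[4]; }  */

/* With char elements, neither trailing array adds to the size of
   its struct on common ABIs.  */
size_t size_z (void) { return sizeof (struct Z); }
size_t size_f (void) { return sizeof (struct F); }
```

Whether a particular compiler version diagnoses the commented-out
access depends on the options and optimization level in use.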

In the process of testing the enhancement I realized that
the recently added component_size() function doesn't work as
intended for non-character array members (see below).  The patch
corrects this by reverting back to the original implementation
of the function until the better/simpler solution can be put in
place as mentioned below.

Tested on x86_64-linux.

Martin


[*] component_size() happens to work for char arrays because those
are transformed to STRING_CSTs, but not for arrays that are not.
E.g., in

   struct S { int64_t i; int16_t j; int16_t a[]; }
     s = { 0, 0, { 1, 0 } };

unless called with type set to int16_t[2], fold_ctor_reference
will return s.a[0] rather than all of s.a.  But to set type to
int16_t[2] we would need to know that s.a's initializer has two
elements, and that's just what we're using fold_ctor_reference
to find out.
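
A small user-level sketch of the same example may help show why
the size has to come from the initializer.  The helper names are
hypothetical and the element count is hard-coded, which is exactly
what the compiler cannot do:

```c
#include <stddef.h>
#include <stdint.h>

struct S { int64_t i; int16_t j; int16_t a[]; };

/* Statically initializing a flexible array member is a GNU
   extension.  */
static struct S s = { 0, 0, { 1, 0 } };

/* sizeof ignores the initializer: it reports the size of the type.  */
size_t type_size (void) { return sizeof s; }

/* Offset at which the trailing array starts.  */
size_t array_offset (void) { return offsetof (struct S, a); }

/* Bytes occupied by the two-element initializer.  The count is
   written by hand here; nothing in the type provides it.  */
size_t initializer_bytes (void) { return 2 * sizeof (int16_t); }

int16_t first_element (void) { return s.a[0]; }
```

Nothing in struct S itself says that s.a has two elements; that
information exists only in the initializer of s.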

I think this could probably be made to work by extending
useless_type_conversion_p to handle this case specially,
but it doesn't seem worth the effort given that there should be
an easier way to do it as you noted below.

Given the above, the long term solution should be to rely on
DECL_SIZE_UNIT(decl) - TYPE_SIZE_UNIT(decl_type) as Richard
suggested in the review of its initial implementation.
Unfortunately, because of bugs in both the C and C++ front ends
(I just opened PR 65403 with the details) the simple formula
doesn't give the right answers either.  So until the bugs are
fixed, the patch reverts back to the original loopy solution.
It's no more costly than the current fold_ctor_reference
approach.
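
As a rough analogy only: the subtraction above is done on GCC trees,
not in user code, but the arithmetic has the same shape as the helper
below.  The function name is hypothetical and the sketch says nothing
about the front end bugs mentioned above:

```c
#include <stddef.h>
#include <stdint.h>

struct S { int64_t i; int16_t j; int16_t a[]; };

/* Hypothetical helper: bytes of an object that lie beyond the size
   of its struct type, i.e. object size minus type size.  Returns 0
   when the object is no larger than the type.  */
size_t
bytes_past_type (size_t object_size, size_t type_size)
{
  return object_size > type_size ? object_size - type_size : 0;
}
```

For an object of sizeof (struct S) + 4 bytes the helper yields 4.
Note that this ignores any tail padding in the struct that the
trailing array could also occupy.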

Comments

Martin Sebor Sept. 6, 2019, 11:26 p.m. UTC | #1
Just a heads up that I tested the patch with Glibc and the kernel.
It exposes some of the same "abuses" of (near) zero-length arrays
as the most recent improvement in this area.  In glibc, it
complains about code in fileops.c, iofwide.c, libc-tls.c, and
rtld.c.  The ones I looked at all look like the last one we saw.
I'll look into how to deal with them next.  In the kernel it
issues a variety of warnings that I need to investigate after
I get back from Cauldron.

Jeff Law Sept. 9, 2019, 5:56 p.m. UTC | #2
On 9/6/19 5:26 PM, Martin Sebor wrote:
> Just a heads up that I tested the patch with Glibc and the kernel.
> It exposes some of the same "abuses" of (near) zero-length arrays
> as the most recent improvement in this area.  In glibc, it
> complains about code in fileops.c, iofwide.c, libc-tls.c, and
> rtld.c.  The ones I looked at all look like the last one we saw.
> I'll look into how to deal with them next.  In the kernel it
> issues a variety of warnings that I need to investigate after
> I get back from Cauldron.
Thanks.  Good to know.

jeff
Jeff Law Sept. 10, 2019, 10:35 p.m. UTC | #3
On 9/6/19 1:27 PM, Martin Sebor wrote:
> gcc-91647.diff
> 
> PR middle-end/91679 - missing -Warray-bounds accessing a member array in a local buffer
> PR middle-end/91647 - new FAILs for Warray-bounds-8 and Wstringop-overflow-3.C
> PR middle-end/91463 - missing -Warray-bounds accessing past the end of a statically initialized flexible array member
> 
> gcc/ChangeLog:
> 
> 	PR middle-end/91679
> 	PR middle-end/91647
> 	PR middle-end/91463
> 	* builtins.c (component_size): Rename to component_ref_size and move...
> 	(compute_objsize): Adjust to callee name change.
> 	* tree-vrp.c (vrp_prop::check_array_ref): Handle trailing arrays with
> 	initializers.
> 	(vrp_prop::check_mem_ref): Handle declared struct objects.
> 	* tree.c (last_field): New function.
> 	(array_at_struct_end_p): Handle MEM_REF.
> 	(get_initializer_for, get_flexarray_size): New helpers.
> 	(component_ref_size): ...move here from builtins.c.  Make extern.
> 	Use get_flexarray_size instead of fold_ctor_reference.
> 	* tree.h (component_ref_size): Declare.
> 	* wide-int.h (generic_wide_int <storage>::sign_mask): Assert invariant.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR middle-end/91679
> 	PR middle-end/91647
> 	PR middle-end/91463
> 	* c-c++-common/Warray-bounds-2.c: Disable VRP.  Adjust expected messages.
> 	* gcc.dg/Warray-bounds-44.c: New test.
> 	* gcc.dg/Warray-bounds-45.c: New test.
> 	* gcc.dg/Wstringop-overflow-16.c: Adjust text of expected messages.
> 	* gcc.dg/pr36902.c: Remove xfail.
> 	* gcc.dg/strlenopt-57.c: Add an expected warning.
> 

> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c	(revision 275387)
> +++ gcc/tree.c	(working copy)
> @@ -13860,6 +13902,134 @@ component_ref_field_offset (tree exp)
[ ... ]
> +  /* If the flexible array member has an known size use the greater
> +     of it and the tail padding in the enclosing struct.
> +     Otherwise, when the size of the flexible array member is unknown
> +     and the referenced object is not a struct, use the size of its
> +     type when known.  This detects sizes of array buffers when cast
> +     to struct tyoes with flexible array members.  */
s/tyoes/types/

So no concerns with the patch itself, just the fallout you mentioned in
a follow-up message.  Ideally we'd have glibc and the kernel fixed
before this goes in, but I'd settle for just getting glibc fixed since
we have more influence there.

Out of curiosity are the kernel issues you raised due to flexible arrays
or just cases where we're doing a better job on normal objects?  I'd be
a bit surprised to find flexible arrays in the kernel.

jeff
Martin Sebor Oct. 11, 2019, 4:34 p.m. UTC | #4
On 9/10/19 4:35 PM, Jeff Law wrote:
...
> 
> So no concerns with the patch itself, just the fallout you mentioned in
> a follow-up message.  Ideally we'd have glibc and the kernel fixed
> before this goes in, but I'd settle for just getting glibc fixed since
> we have more influence there.

Half of the issues there were due to a bug in the warning.  The rest
are caused by Glibc's use of interior zero-length arrays to access
subsequent members.  It works in simple cases but it's very brittle
because GCC assumes that even such members don't alias. If it's meant
to be a supported feature then aliasing would have to be changed to
take it into account.  But I'd rather encourage projects to move away
from these dangerous hacks and towards cleaner, safer code.

I've fixed the bug in the attached patch.  The rest can be suppressed
by replacing the zero-length arrays with flexible array members but
that's just trading one misuse for another.  If the code can't be
changed to avoid this (likely not an option since the arrays are
visible in the public API) I think the best way to deal with the
warnings is to suppress them with #pragma GCC diagnostic ignored.
I opened BZ 25097 in Glibc Bugzilla to track this.
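
A minimal sketch of that kind of suppression, assuming a made-up
struct and function (this is not Glibc code, and I haven't checked
which compiler versions warn on this exact function):

```c
#include <stddef.h>

/* Hypothetical struct with an interior zero-length array (a GNU
   extension) followed by other members.  */
struct hdr
{
  int kind;
  char mark[0];
  int x;
  int y;
};

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
/* Return the address I bytes past the start of MARK.  Any I other
   than zero points into the members that follow, which is the
   idiom in question.  */
char *
hdr_bytes_at (struct hdr *p, size_t i)
{
  return &p->mark[i];
}
#pragma GCC diagnostic pop
```

The push/pop pair limits the suppression to the one function rather
than the rest of the translation unit.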

> Out of curiosity are the kernel issues you raised due to flexible arrays
> or just cases where we're doing a better job on normal objects?  I'd be
> a bit surprised to find flexible arrays in the kernel.

I don't think I've come across any flexible arrays in the kernel.

The patch triggers 94 instances of -Warray-bounds (60 of which
are for distinct code) in 21 .c files.  I haven't looked at all
of them but some of the patterns I noticed are:

1) Intentionally using an interior zero-length array to access
    (e.g., memset) one or more subsequent members. E.g.,
    _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and
    quite a few others.  This is pretty pervasive but seems easily
    avoidable.

2) Overwriting a member array with more data than it can hold
    (e.g., in function cxio_rdev_open in
    drivers/infiniband/hw/cxgb3/cxio_hal.c or in function
    pk_probe in drivers/hid/hid-prodikeys.c).  At first glance
    some of these look like bugs but with stuff obscured by macros
    and no comments it's hard to tell.

3) Uses of the container_of() macro to access one member given
    the address of another.  This is undefined (and again breaks
    the aliasing rules) but the macro is used all over the place
    in the kernel.  I count over 15,000 references to it.

4) Uses of one-element arrays as members of other one-element
    arrays (in include/scsi/fc/fc_ms.h).  Was this ever meant
    to be supported by GCC?  (It isn't by _FORTIFY_SOURCE=2.)

5) Possible false positives due to the recent loop unrolling
    change.
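
To show what "easily avoidable" could mean for pattern (1), here is
a sketch that clears the trailing members of a struct without going
through a zero-length marker array.  The struct and function are
hypothetical, not kernel code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: ID is preserved, everything from COUNTER on
   is reset.  */
struct state
{
  int id;
  int counter;
  int flags;
};

/* Clear every member from COUNTER to the end of *P, addressing the
   bytes relative to the start of the whole object.  */
void
reset_tail (struct state *p)
{
  size_t off = offsetof (struct state, counter);
  memset ((char *) p + off, 0, sizeof *p - off);
}
```

Because the pointer passed to memset is derived from the enclosing
object rather than from a member, the access stays within the
bounds of the object being written.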

It will be quite a bit of work to clean this up.  To make it
easier we could introduce a new option to control the warning
for some of the most common idioms, such as
-Wzero-length-array-bounds.  I'm not too wild about this because
it would just paper over the problem.  A better solution would
also involve avoiding the aliasing assumptions for overlapping
zero-length member arrays.

Anyway, attached is the updated patch with just the one fix
I mentioned above, retested on x86_64-linux.

Martin
Martin Sebor Oct. 24, 2019, 2:46 p.m. UTC | #5
Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00860.html

Should I add something like the -Wzero-length-array-bounds option
to allow some of the questionable idioms seen in the kernel?

Jeff Law Oct. 31, 2019, 6:54 p.m. UTC | #6
On 10/11/19 10:34 AM, Martin Sebor wrote:
> I've fixed the bug in the attached patch.  The rest can be suppressed
> by replacing the zero-length arrays with flexible array members but
> that's just trading one misuse for another.  If the code can't be
> changed to avoid this (likely not an option since the arrays are in
> visible in the public API) I think the best way to deal with them is
> to suppress them by #pragma GCC diagnostic ignored.  I opened BZ 25097
> in Glibc Bugzilla to track this.
While it's trading one misuse for another, one could argue that a
flexible array is less of a misuse :-)

> 
> The patch triggers 94 instances of -Warray-bounds (60 of which
> are for distinct code) in 21 .c files.  I haven't looked at all
> of them but some of the patterns I noticed are:
> 
> 1) Intentionally using an interior zero-length array to access
>    (e.g., memset) one or more subsequent members. E.g.,
>    _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and
>    quite a few others.  This is pretty pervasive but seems easily
>    avoidable.
Yea, I saw something like this as well.  I can't recall if it was in the
kernel or a package that was reading structured data of some sort.
They'd use the zero length array to access the entire blob-o-data and
subsequent fields when they wanted to access particular chunks of data
that were at well known offsets from the start.


> 
> 2) Overwriting a member array with more data (e.g., function
>    cxio_rdev_open in
>    drivers/infiniband/hw/cxgb3/cxio_hal.c or in function
>    pk_probe in drivers/hid/hid-prodikeys.c).  At first glance
>    some of these look like bugs but with stuff obscured by macros
>    and no comments it's hard to tell.
ACK.

> 
> 3) Uses of the container_of() macro to access one member given
>    the address of another.  This is undefined (and again breaks
>    the aliasing rules) but the macro is used all over the place
>    in the kernel.  I count over 15,000 references to it.
Ugh.

> 
> 4) Uses of one-element arrays as members of other one-element
>    arrays (in include/scsi/fc/fc_ms.h).  Was this ever meant
>    to be supported by GCC?  (It isn't by _FORTIFY_SOURCE=2.)
> 
> 5) Possible false positives due to the recent loop unrolling
>    change.
I think I've worked around some #5 issues as well.  It's got an
associated BZ and hopefully will be addressed during stage3/stage4.

> 
> It will be a quite a bit of work to clean this up.  To make it
> easier we would introduce a new option to control the warning
> for some of the most common idioms, such as
> -Wzero-length-array-bounds.  I'm not too wild about this because
> it would just paper over the problem.  A better solution would
> also involve avoiding the aliasing assumptions for overlapping
> zero-length member arrays.
> 
> Anyway, attached is the updated patch with just the one fix
> I mentioned above, retested on x86_64-linux.
I think the agreement in our meeting yesterday was to give some kind of
knob, particularly for the kernel folks.


> 
> Martin
> 
> gcc-91647.diff
> 
> PR middle-end/91679 - missing -Warray-bounds accessing a member array in a local buffer
> PR middle-end/91647 - new FAILs for Warray-bounds-8 and Wstringop-overflow-3.C
> PR middle-end/91463 - missing -Warray-bounds accessing past the end of a statically initialized flexible array member
> 
> gcc/ChangeLog:
> 
> 	PR middle-end/91679
> 	PR middle-end/91647
> 	PR middle-end/91463
> 	* tree-vrp.c (vrp_prop::check_array_ref): Handle trailing arrays with
> 	initializers.
> 	(vrp_prop::check_mem_ref): Handle declared struct objects.
> 	* tree.c (last_field): New function.
> 	(array_at_struct_end_p): Handle MEM_REF.
> 	(get_initializer_for): New helper.
> 	(component_ref_size): Rename locals.  Call get_initializer_for instead
> 	of fold_ctor_reference.  Correct handling of flexible array members. 
> 	* wide-int.h (generic_wide_int <storage>::sign_mask): Assert invariant.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR middle-end/91679
> 	PR middle-end/91647
> 	PR middle-end/91463
> 	* c-c++-common/Warray-bounds-2.c: Disable VRP.  Adjust expected messages.
> 	* gcc.dg/Warray-bounds-48.c: New test.
> 	* gcc.dg/Warray-bounds-49.c: New test.
> 	* gcc.dg/Wstringop-overflow-16.c: Adjust text of expected messages.
> 	* gcc.dg/pr36902.c: Remove xfail.
> 	* gcc.dg/strlenopt-57.c: Add an expected warning.
So per our meeting yesterday, add a knob for the kernel folks to be able
to control this and it's OK.

jeff
Martin Sebor Nov. 1, 2019, 9:09 p.m. UTC | #7
On 10/31/19 12:54 PM, Jeff Law wrote:
> So per our meeting yesterday, add a knob for the kernel folks to be able
> to control this and it's OK.

I've added a new option, -Wzero-length-bounds to control this.
It complicates the get_component_size function a bit but the rest
of the changes weren't intrusive.

To make a clear distinction between zero-length arrays and flexible
array members in diagnostics I tweaked how the former are shown and
added the zero.  With that, a zero-length array is now rendered as
char[0] and not as char[] as it was before, and as flexible arrays
still are.

I also had to work around an apparent deficiency in loop unrolling
(PR 92323) that triggered a -Warray-bounds false positive in
gimple-match-head.c by using memcpy instead of a hand-rolled loop.

Rebuilding the kernel with the updated patch results in the following
breakdown of the two warnings (the numbers are total instances of each,
unique instances, and files they come from):

   -Wzero-length-bounds                 49       46       13
   -Warray-bounds                       45       14        8

The -Warray-bounds instances I checked look legitimate even though
the code in some of them still looks benign.  I'm not sure there's
a good way to relax the warning to sanction some of these abuses
without also missing some bugs.  It might be worth looking into
some more in stage 3, depending on the fallout during mass rebuild.

After bootstrapping on x86_64 and i386 and regtesting I committed
the attached patch in r277728.

Martin
Jakub Jelinek Nov. 1, 2019, 11:24 p.m. UTC | #8
On Fri, Nov 01, 2019 at 03:09:39PM -0600, Martin Sebor wrote:
> 	* gcc.dg/pr36902.c: Remove xfail.

> --- a/gcc/testsuite/gcc.dg/pr36902.c
> +++ b/gcc/testsuite/gcc.dg/pr36902.c
> @@ -44,7 +44,7 @@ foo2(unsigned char * to, const unsigned char * from, int n)
>        *to = *from;
>        break;
>      case 5:
> -      to[4] = from [4]; /* { dg-warning "array subscript is above array bounds" "" { xfail *-*-* } } */
> +      to[4] = from [4]; /* { dg-warning "\\\[-Warray-bounds } */
>        break;
>      }
>    return to;

This FAILs:
+ERROR: gcc.dg/pr36902.c: missing " for " dg-warning 47 "\\\\\\[-Warray-bounds "
+ERROR: gcc.dg/pr36902.c: missing " for " dg-warning 47 "\\\\\\[-Warray-bounds "
Fixed thusly, regtested on x86_64-linux and i686-linux, committed to trunk
as obvious:

2019-11-02  Jakub Jelinek  <jakub@redhat.com>

	* gcc.dg/pr36902.c: Terminate dg-warning regexp string.

--- gcc/testsuite/gcc.dg/pr36902.c.jj	2019-11-01 22:19:48.757844885 +0100
+++ gcc/testsuite/gcc.dg/pr36902.c	2019-11-02 00:21:00.556117852 +0100
@@ -44,7 +44,7 @@ foo2(unsigned char * to, const unsigned
       *to = *from;
       break;
     case 5:
-      to[4] = from [4]; /* { dg-warning "\\\[-Warray-bounds } */
+      to[4] = from [4]; /* { dg-warning "\\\[-Warray-bounds" } */
       break;
     }
   return to;


	Jakub
Maciej W. Rozycki Nov. 6, 2019, 11:09 p.m. UTC | #9
On Fri, 1 Nov 2019, Martin Sebor wrote:

> Rebuilding the kernel with the updated patch results in the following
> breakdown of the two warnings (the numbers are total instances of each,
> unique instances, and files they come from):
> 
>    -Wzero-length-bounds                 49       46       13
>    -Warray-bounds                       45       14        8
> 
> The -Warray-bounds instances I checked look legitimate even though
> the code in some of them still looks benign.  I'm not sure there's
> a good way to relax the warning to sanction some of these abuses
> without also missing some bugs.  It might be worth looking into
> some more in stage 3, depending on the fallout during mass rebuild.
> 
> After bootstrapping on x86_64 and i386 and regtesting I committed
> the attached patch in r277728.

 It is what I believe has also broken glibc:

In file included from ../sysdeps/riscv/libc-tls.c:19:
../csu/libc-tls.c: In function '__libc_setup_tls':
../csu/libc-tls.c:209:30: error: array subscript 1 is outside the bounds of an interior zero-length array 'struct dtv_slotinfo[0]' [-Werror=zero-length-bounds]
  209 |   static_slotinfo.si.slotinfo[1].map = main_map;
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
In file included from ../sysdeps/riscv/ldsodefs.h:46,
                 from ../sysdeps/gnu/ldsodefs.h:46,
                 from ../sysdeps/unix/sysv/linux/ldsodefs.h:25,
                 from ../sysdeps/unix/sysv/linux/riscv/ldsodefs.h:22,
                 from ../csu/libc-tls.c:21,
                 from ../sysdeps/riscv/libc-tls.c:19:
../sysdeps/generic/ldsodefs.h:423:7: note: while referencing 'slotinfo'
  423 |     } slotinfo[0];
      |       ^~~~~~~~
cc1: all warnings being treated as errors

(here in a RISC-V build).

 Has anybody looked yet into how the breakage could possibly be addressed?

  Maciej
Jeff Law Nov. 6, 2019, 11:25 p.m. UTC | #10
On 11/6/19 4:09 PM, Maciej W. Rozycki wrote:
> On Fri, 1 Nov 2019, Martin Sebor wrote:
> 
>> Rebuilding the kernel with the updated patch results in the following
>> breakdown of the two warnings (the numbers are total instances of each,
>> unique instances, and files they come from):
>>
>>    -Wzero-length-bounds                 49       46       13
>>    -Warray-bounds                       45       14        8
>>
>> The -Warray-bounds instances I checked look legitimate even though
>> the code in some of them still looks benign.  I'm not sure there's
>> a good way to relax the warning to sanction some of these abuses
>> without also missing some bugs.  It might be worth looking into
>> some more in stage 3, depending on the fallout during mass rebuild.
>>
>> After bootstrapping on x86_64 and i386 and regtesting I committed
>> the attached patch in r277728.
> 
>  It is what I believe has also broken glibc:
> 
> In file included from ../sysdeps/riscv/libc-tls.c:19:
> ../csu/libc-tls.c: In function '__libc_setup_tls':
> ../csu/libc-tls.c:209:30: error: array subscript 1 is outside the bounds of an interior zero-length array 'struct dtv_slotinfo[0]' [-Werror=zero-length-bounds]
>   209 |   static_slotinfo.si.slotinfo[1].map = main_map;
>       |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
> In file included from ../sysdeps/riscv/ldsodefs.h:46,
>                  from ../sysdeps/gnu/ldsodefs.h:46,
>                  from ../sysdeps/unix/sysv/linux/ldsodefs.h:25,
>                  from ../sysdeps/unix/sysv/linux/riscv/ldsodefs.h:22,
>                  from ../csu/libc-tls.c:21,
>                  from ../sysdeps/riscv/libc-tls.c:19:
> ../sysdeps/generic/ldsodefs.h:423:7: note: while referencing 'slotinfo'
>   423 |     } slotinfo[0];
>       |       ^~~~~~~~
> cc1: all warnings being treated as errors
> 
> (here in a RISC-V build).
> 
>  Has anybody looked yet into how the breakage could possibly be addressed?
Yea, Florian posted patches over the weekend to fix glibc.  They're
still going through the review/update cycle.

jeff
Maciej W. Rozycki Nov. 7, 2019, 12:02 a.m. UTC | #11
On Wed, 6 Nov 2019, Jeff Law wrote:

> >  It is what I believe has also broken glibc:
> > 
> > In file included from ../sysdeps/riscv/libc-tls.c:19:
> > ../csu/libc-tls.c: In function '__libc_setup_tls':
> > ../csu/libc-tls.c:209:30: error: array subscript 1 is outside the bounds of an interior zero-length array 'struct dtv_slotinfo[0]' [-Werror=zero-length-bounds]
> >   209 |   static_slotinfo.si.slotinfo[1].map = main_map;
> >       |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
> > In file included from ../sysdeps/riscv/ldsodefs.h:46,
> >                  from ../sysdeps/gnu/ldsodefs.h:46,
> >                  from ../sysdeps/unix/sysv/linux/ldsodefs.h:25,
> >                  from ../sysdeps/unix/sysv/linux/riscv/ldsodefs.h:22,
> >                  from ../csu/libc-tls.c:21,
> >                  from ../sysdeps/riscv/libc-tls.c:19:
> > ../sysdeps/generic/ldsodefs.h:423:7: note: while referencing 'slotinfo'
> >   423 |     } slotinfo[0];
> >       |       ^~~~~~~~
> > cc1: all warnings being treated as errors
> > 
> > (here in a RISC-V build).
> > 
> >  Has anybody looked yet into how the breakage could possibly be addressed?
> Yea, Florian posted patches over the weekend to fix glibc.  They're
> still going through the review/update cycle.

 Thanks, I have found them now that I know what to look for and in
what time frame.

 Unfortunately there's no mention of the error message or at least the 
name of the `-Wzero-length-bounds' option (which is how I found the GCC 
patch) in the respective glibc change descriptions so my mailing list 
searches returned nothing.  I think it would be good to try to include
keywords people are likely to search for in change descriptions, and
verbatim error messages are certainly good candidates IMO.

 So I went for `-Wno-zero-length-bounds' for my glibc build for the time 
being, as my objective now is to get some outstanding GCC stuff in before 
stage 1 ends rather than being drawn into glibc build issues.

  Maciej
Matthew Malcomson Dec. 9, 2019, 4:11 p.m. UTC | #12
On 01/11/2019 21:09, Martin Sebor wrote:
> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
> index 53278168a59..d7c74a1865a 100644
> --- a/gcc/gimple-match-head.c
> +++ b/gcc/gimple-match-head.c
> @@ -837,8 +837,8 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
>     gimple_match_op cond_op (gimple_match_cond (res_op->ops[0],
>   					      res_op->ops[num_ops - 1]),
>   			   op, res_op->type, num_ops - 2);
> -  for (unsigned int i = 1; i < num_ops - 1; ++i)
> -    cond_op.ops[i - 1] = res_op->ops[i];
> +
> +  memcpy (cond_op.ops, res_op->ops + 1, (num_ops - 1) * sizeof *cond_op.ops);
>     switch (num_ops - 2)
>       {
>       case 2:

I think this copies one more element than the original code.

(copying `num_ops - 1` elements, while the previous loop only copied 
`num_ops - 2` elements since the counter started at 1).
Martin Sebor Dec. 9, 2019, 5:25 p.m. UTC | #13
On 12/9/19 9:11 AM, Matthew Malcomson wrote:
> On 01/11/2019 21:09, Martin Sebor wrote:
>> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
>> index 53278168a59..d7c74a1865a 100644
>> --- a/gcc/gimple-match-head.c
>> +++ b/gcc/gimple-match-head.c
>> @@ -837,8 +837,8 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
>>      gimple_match_op cond_op (gimple_match_cond (res_op->ops[0],
>>    					      res_op->ops[num_ops - 1]),
>>    			   op, res_op->type, num_ops - 2);
>> -  for (unsigned int i = 1; i < num_ops - 1; ++i)
>> -    cond_op.ops[i - 1] = res_op->ops[i];
>> +
>> +  memcpy (cond_op.ops, res_op->ops + 1, (num_ops - 1) * sizeof *cond_op.ops);
>>      switch (num_ops - 2)
>>        {
>>        case 2:
> 
> I think this copies one more element than the original code.
> 
> (copying `num_ops - 1` elements, while the previous loop only copied
> `num_ops - 2` elements since the counter started at 1).
> 

Yes, I think you're right.  I only noticed after I committed
the change, but didn't think it actually causes any problems
(i.e., it doesn't read past the end).  Let me know if you
think otherwise.

Martin

Patch

PR middle-end/91679 - missing -Warray-bounds accessing a member array in a local buffer
PR middle-end/91647 - new FAILs for Warray-bounds-8 and Wstringop-overflow-3.C
PR middle-end/91463 - missing -Warray-bounds accessing past the end of a statically initialized flexible array member

gcc/ChangeLog:

	PR middle-end/91679
	PR middle-end/91647
	PR middle-end/91463
	* builtins.c (component_size): Rename to component_ref_size and move...
	(compute_objsize): Adjust to callee name change.
	* tree-vrp.c (vrp_prop::check_array_ref): Handle trailing arrays with
	initializers.
	(vrp_prop::check_mem_ref): Handle declared struct objects.
	* tree.c (last_field): New function.
	(array_at_struct_end_p): Handle MEM_REF.
	(get_initializer_for, get_flexarray_size): New helpers.
	(component_ref_size): ...move here from builtins.c.  Make extern.
	Use get_flexarray_size instead of fold_ctor_reference.
	* tree.h (component_ref_size): Declare.
	* wide-int.h (generic_wide_int <storage>::sign_mask): Assert invariant.

gcc/testsuite/ChangeLog:

	PR middle-end/91679
	PR middle-end/91647
	PR middle-end/91463
	* c-c++-common/Warray-bounds-2.c: Disable VRP.  Adjust expected messages.
	* gcc.dg/Warray-bounds-44.c: New test.
	* gcc.dg/Warray-bounds-45.c: New test.
	* gcc.dg/Wstringop-overflow-16.c: Adjust text of expected messages.
	* gcc.dg/pr36902.c: Remove xfail.
	* gcc.dg/strlenopt-57.c: Add an expected warning.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 275387)
+++ gcc/builtins.c	(working copy)
@@ -3562,54 +3562,6 @@  check_access (tree exp, tree, tree, tree dstwrite,
   return true;
 }
 
-/* Determines the size of the member referenced by the COMPONENT_REF
-   REF, using its initializer expression if necessary in order to
-   determine the size of an initialized flexible array member.
-   Returns the size (which might be zero for an object with
-   an uninitialized flexible array member) or null if the size
-   cannot be determined.  */
-
-static tree
-component_size (tree ref)
-{
-  gcc_assert (TREE_CODE (ref) == COMPONENT_REF);
-
-  tree member = TREE_OPERAND (ref, 1);
-
-  /* If the member is not last or has a size greater than one, return
-     it.  Otherwise it's either a flexible array member or a zero-length
-     array member, or an array of length one treated as such.  */
-  tree size = DECL_SIZE_UNIT (member);
-  if (size
-      && (!array_at_struct_end_p (ref)
-	  || (!integer_zerop (size)
-	      && !integer_onep (size))))
-    return size;
-
-  /* If the reference is to a declared object and the member a true
-     flexible array, try to determine its size from its initializer.  */
-  poly_int64 off = 0;
-  tree base = get_addr_base_and_unit_offset (ref, &off);
-  if (!base || !VAR_P (base))
-    return NULL_TREE;
-
-  /* The size of any member of a declared object other than a flexible
-     array member is that obtained above.  */
-  if (size)
-    return size;
-
-  if (tree init = DECL_INITIAL (base))
-    if (TREE_CODE (init) == CONSTRUCTOR)
-      {
-	off <<= LOG2_BITS_PER_UNIT;
-	init = fold_ctor_reference (NULL_TREE, init, off, 0, base);
-	if (init)
-	  return TYPE_SIZE_UNIT (TREE_TYPE (init));
-      }
-
-  return DECL_EXTERNAL (base) ? NULL_TREE : integer_zero_node;
-}
-
 /* Helper to compute the size of the object referenced by the DEST
    expression which must have pointer type, using Object Size type
    OSTYPE (only the least significant 2 bits are used).  Return
@@ -3744,7 +3696,7 @@  compute_objsize (tree dest, int ostype, tree *pdec
   if (TREE_CODE (dest) == COMPONENT_REF)
     {
       *pdecl = TREE_OPERAND (dest, 1);
-      return component_size (dest);
+      return component_ref_size (dest);
     }
 
   if (TREE_CODE (dest) != ADDR_EXPR)
Index: gcc/testsuite/c-c++-common/Warray-bounds-2.c
===================================================================
--- gcc/testsuite/c-c++-common/Warray-bounds-2.c	(revision 275387)
+++ gcc/testsuite/c-c++-common/Warray-bounds-2.c	(working copy)
@@ -6,7 +6,7 @@ 
    source of the excessive array bound is in a different function than
    the call.
    { dg-do compile }
-   { dg-options "-O2 -Warray-bounds -Wno-stringop-overflow" } */
+   { dg-options "-O2 -Warray-bounds -Wno-stringop-overflow -fno-tree-vrp" } */
 
 #if __has_include (<stddef.h>)
 #  include <stddef.h>
@@ -216,13 +216,13 @@  void call_strncpy_dst_diff_max (const char *s, siz
 static void
 wrap_strncpy_dstarray_diff_neg (char *d, const char *s, ptrdiff_t i, size_t n)
 {
-  strncpy (d + i, s, n);   /* { dg-bogus "offset -\[0-9\]+ is out of the bounds \\\[0, 90] of object .ar10. with type .(struct )?Array ?\\\[2]." "strncpy" } */
-}			   /* { dg-warning "array subscript -1 is outside array bounds" "" { target *-*-* } .-1 } */
+  strncpy (d + i, s, n);   /* { dg-warning "offset -\[0-9\]+ is out of the bounds \\\[0, 90] of object .ar10. with type .(struct )?Array ?\\\[2]." "strncpy" } */
+}
 
 void call_strncpy_dstarray_diff_neg (const char *s, size_t n)
 {
-  struct Array ar10[2];    /* { dg-bogus ".ar10. declared here" } */
-  sink (&ar10);		   /* { dg-message "while referencing" "" { target *-*-* } .-1 } */
+  struct Array ar10[2];    /* { dg-message ".ar10. declared here" } */
+  sink (&ar10);
 
   int off = (char*)ar10[1].a17 - (char*)ar10 + 1;
   wrap_strncpy_dstarray_diff_neg (ar10[1].a17, s, -off, n);
Index: gcc/testsuite/gcc.dg/Warray-bounds-44.c
===================================================================
--- gcc/testsuite/gcc.dg/Warray-bounds-44.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Warray-bounds-44.c	(working copy)
@@ -0,0 +1,263 @@ 
+/* PR middle-end/91647 - missing -Warray-bounds accessing a zero-length array
+   of a declared object
+   { dg-do "compile" }
+   { dg-options "-O2 -Wall" } */
+
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+
+void sink (void*);
+
+struct A0
+{
+  int32_t n;
+  int16_t a0[0];    // { dg-message "while referencing 'a0'" "member" { xfail *-*-* } }
+};
+
+static void warn_a0_local (struct A0 *p)
+{
+  p->a0[0] = 0;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[1] = 1;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a0_extern (struct A0 *p)
+{
+  p->a0[0] = 0;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[1] = 1;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a0_local_buf (struct A0 *p)
+{
+  p->a0[0] = 0; p->a0[1] = 1;
+
+  p->a0[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[3] = 3;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[4] = 4;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a0_extern_buf (struct A0 *p)
+{
+  p->a0[0] = 0; p->a0[1] = 1; p->a0[2] = 2;
+
+  p->a0[3] = 3;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[4] = 4;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a0[5] = 5;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void nowarn_a0_extern_bufx (struct A0 *p)
+{
+  p->a0[0] = 0; p->a0[99] = 99; p->a0[999] = 999; p->a0[9999] = 9999;
+}
+
+static void nowarn_a0_ref (struct A0 *p)
+{
+  p->a0[0] = 0; p->a0[99] = 99; p->a0[999] = 999; p->a0[9999] = 9999;
+}
+
+void test_a0 (struct A0 *p, unsigned n)
+{
+  /* FIXME: The messages issued in warn_a0 are actually:
+       warning: array subscript 2 is outside array bounds of 'struct A0[1]'
+       note: while referencing 's0'
+     rather than the expected
+       array subscript 0 is above array bounds of 'short int[0]'
+       note: while referencing 'aa0'
+  */
+  {
+    struct A0 sa0;  // { dg-message "while referencing 'sa0'" "struct" }
+    warn_a0_local (&sa0);
+    sink (&sa0);
+  }
+
+  {
+    extern
+      struct A0 xsa0;  // { dg-message "while referencing 'xsa0'" "struct" }
+    warn_a0_extern (&xsa0);
+    sink (&xsa0);
+  }
+
+  {
+    /* Verify out-of-bounds access to the local BUF is diagnosed.  */
+    char buf_p2[sizeof (struct A0) + 2 * sizeof (int16_t)];
+    warn_a0_local_buf ((struct A0*) buf_p2);
+    sink (buf_p2);
+  }
+
+  {
+    /* Verify out-of-bounds access to the extern BUF with a known
+       bound is diagnosed.  */
+    extern char a0_buf_p3[sizeof (struct A0) + 3 * sizeof (int16_t)];
+    warn_a0_extern_buf ((struct A0*) a0_buf_p3);
+    sink (a0_buf_p3);
+  }
+
+  {
+    /* Verify that accesses to BUFX with an unknown bound are not
+       diagnosed.  */
+    extern char bufx[];
+    nowarn_a0_extern_bufx ((struct A0*) bufx);
+    sink (bufx);
+  }
+
+  {
+    /* Verify that accesses to BUFN with a runtime bound are not
+       diagnosed.  */
+    char bufn[n];
+    nowarn_a0_extern_bufx ((struct A0*) bufn);
+    sink (bufn);
+  }
+
+  nowarn_a0_ref (p);
+}
+
+
+struct A1
+{
+  int32_t n;
+  int16_t a1[1];    // { dg-message "while referencing 'a1'" }
+};
+
+static void warn_a1_local_noinit (struct A1 *p)
+{
+  p->a1[0] = 0;
+  p->a1[1] = 1;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a1[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a1_extern (struct A1 *p)
+{
+  p->a1[0] = 0;
+  p->a1[1] = 1;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a1[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a1_init (struct A1 *p)
+{
+  p->a1[0] = 0;
+  p->a1[1] = 1;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a1[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a1_local_buf (struct A1 *p)
+{
+  p->a1[0] = 0; p->a1[1] = 1; p->a1[2] = 2; p->a1[3] = 3;
+
+  p->a1[4] = 4;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a1_extern_buf (struct A1 *p)
+{
+  p->a1[0] = 0; p->a1[1] = 1; p->a1[2] = 2; p->a1[3] = 3; p->a1[4] = 4;
+
+  p->a1[5] = 5;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void nowarn_a1_extern_bufx (struct A1 *p)
+{
+  p->a1[0] = 0; p->a1[99] = 99; p->a1[999] = 999; p->a1[9999] = 9999;
+}
+
+static void nowarn_a1_ref (struct A1 *p)
+{
+  p->a1[0] = 0; p->a1[99] = 99; p->a1[999] = 999; p->a1[9999] = 9999;
+}
+
+void test_a1 (struct A1 *p, unsigned n)
+{
+  {
+    struct A1 a1;
+    warn_a1_local_noinit (&a1);
+    sink (&a1);
+  }
+
+  {
+    extern struct A1 a1x;
+    warn_a1_extern (&a1x);
+    sink (&a1x);
+}
+  {
+    struct A1 a1 = { 0, { 1 } };
+    warn_a1_init (&a1);
+    sink (&a1);
+  }
+
+  {
+    /* Verify out-of-bounds access to the local BUF is diagnosed.  */
+    char buf_p2[sizeof (struct A1) + 2 * sizeof (int16_t)];
+    warn_a1_local_buf ((struct A1*) buf_p2);
+    sink (buf_p2);
+  }
+
+  {
+    /* Verify out-of-bounds access to the extern BUF with a known
+       bound is diagnosed.  */
+    extern char a1_buf_p3[sizeof (struct A1) + 3 * sizeof (int16_t)];
+    warn_a1_extern_buf ((struct A1*) a1_buf_p3);
+    sink (a1_buf_p3);
+  }
+
+  {
+    /* Verify that accesses to BUFX with an unknown bound are not
+       diagnosed.  */
+    extern char bufx[];
+    nowarn_a1_extern_bufx ((struct A1*) bufx);
+    sink (bufx);
+  }
+
+  {
+    /* Verify that accesses to BUFN with a runtime bound are not
+       diagnosed.  */
+    char bufn[n];
+    nowarn_a1_extern_bufx ((struct A1*) bufn);
+    sink (bufn);
+  }
+
+  nowarn_a1_ref (p);
+}
+
+
+struct A2
+{
+  int32_t n;
+  int16_t a2[2];    // { dg-message "while referencing 'a2'" }
+};
+
+static void warn_a2_noinit (struct A2 *p)
+{
+  p->a2[0] = 0; p->a2[1] = 1;
+
+  p->a2[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a2_init (struct A2 *p)
+{
+  p->a2[0] = 0; p->a2[1] = 1;
+
+  p->a2[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a2[9] = 9;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+static void warn_a2_ref (struct A2 *p)
+{
+  p->a2[0] = 0; p->a2[1] = 1;
+
+  p->a2[2] = 2;     // { dg-warning "\\\[-Warray-bounds" }
+  p->a2[9] = 9;     // { dg-warning "\\\[-Warray-bounds" }
+}
+
+void test_a2 (struct A2 *p)
+{
+  {
+    struct A2 a2;
+    warn_a2_noinit (&a2);
+    sink (&a2);
+  }
+
+  {
+    struct A2 a2 = { 0, { 1, 2 } };
+    warn_a2_init (&a2);
+    sink (&a2);
+  }
+
+  warn_a2_ref (p);
+}
Index: gcc/testsuite/gcc.dg/Warray-bounds-45.c
===================================================================
--- gcc/testsuite/gcc.dg/Warray-bounds-45.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Warray-bounds-45.c	(working copy)
@@ -0,0 +1,114 @@ 
+/* PR middle-end/91647 - missing -Warray-bounds accessing a zero-length array
+   of a declared object
+   { dg-do "compile" }
+   { dg-options "-O2 -Wall" } */
+
+struct __attribute__ ((aligned (16))) A16
+{
+  __INT64_TYPE__ i8;
+  __INT16_TYPE__ i2;
+  __INT16_TYPE__ a2[];
+};
+
+struct A16 a0 = { };
+
+void test_a0 (void)
+{
+  a0.a2[0] = 0; a0.a2[1] = 1; a0.a2[2] = 2;
+
+  a0.a2[3] = 3;     // { dg-warning "array subscript 3 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a1 = { .a2 = { 1 } };
+
+void test_a1 (void)
+{
+  a1.a2[0] = 0; a1.a2[1] = 1; a1.a2[2] = 2;
+
+  a1.a2[3] = 3;     // { dg-warning "array subscript 3 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a2 = { .a2 = { 1, 2 } };
+
+void test_a2 (void)
+{
+  a2.a2[0] = 0; a2.a2[1] = 1; a2.a2[2] = 2;
+
+  a2.a2[3] = 3;     // { dg-warning "array subscript 3 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a3 = { .a2 = { 1, 2, 3 } };
+
+void test_a3 (void)
+{
+  a3.a2[0] = 0; a3.a2[1] = 1; a3.a2[2] = 2;
+
+  a3.a2[3] = 3;     // { dg-warning "array subscript 3 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a4 = { .a2 = { 1, 2, 3, 4 } };
+
+void test_a4 (void)
+{
+  a4.a2[0] = 0; a4.a2[1] = 1; a4.a2[2] = 2; a4.a2[3] = 3;
+
+  a4.a2[4] = 4;     // { dg-warning "array subscript 4 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a5 = { .a2 = { 1, 2, 3, 4, 5 } };
+
+void test_a5 (void)
+{
+  a5.a2[0] = 0; a5.a2[1] = 1; a5.a2[2] = 2; a5.a2[3] = 3; a5.a2[4] = 4;
+
+  a5.a2[5] = 5;     // { dg-warning "array subscript 5 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a6 = { .a2 = { 1, 2, 3, 4, 5, 6 } };
+
+void test_a6 (void)
+{
+  a6.a2[0] = 0; a6.a2[1] = 1; a6.a2[2] = 2; a6.a2[3] = 3; a6.a2[4] = 4;
+  a6.a2[5] = 5;
+
+  a6.a2[6] = 6;     // { dg-warning "array subscript 6 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a7 = { .a2 = { 1, 2, 3, 4, 5, 6, 7 } };
+
+void test_a7 (void)
+{
+  a7.a2[0] = 0; a7.a2[1] = 1; a7.a2[2] = 2; a7.a2[3] = 3; a7.a2[4] = 4;
+  a7.a2[5] = 5; a7.a2[5] = 5; a7.a2[6] = 6;
+
+  a7.a2[7] = 7;     // { dg-warning "array subscript 7 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a8 = { .a2 = { 1, 2, 3, 4, 5, 6, 7, 8 } };
+
+void test_a8 (void)
+{
+  a8.a2[0] = 0; a8.a2[1] = 1; a8.a2[2] = 2; a8.a2[3] = 3; a8.a2[4] = 4;
+  a8.a2[5] = 5; a8.a2[5] = 5; a8.a2[6] = 6; a8.a2[7] = 7;
+
+  a8.a2[8] = 8;     // { dg-warning "array subscript 8 is above array bounds of 'short int\\\[]'" }
+}
+
+
+struct A16 a9 = { .a2 = { 1, 2, 3, 4, 5, 6, 7, 8, 9 } };
+
+void test_a9 (void)
+{
+  a8.a2[0] = 8; a8.a2[1] = 7; a8.a2[2] = 6; a8.a2[3] = 5; a8.a2[4] = 4;
+  a8.a2[5] = 3; a8.a2[5] = 2; a8.a2[6] = 1; a8.a2[7] = 0;
+
+  a8.a2[9] = 8;     // { dg-warning "array subscript 9 is above array bounds of 'short int\\\[]'" }
+}
Index: gcc/testsuite/gcc.dg/Wstringop-overflow-16.c
===================================================================
--- gcc/testsuite/gcc.dg/Wstringop-overflow-16.c	(revision 275387)
+++ gcc/testsuite/gcc.dg/Wstringop-overflow-16.c	(working copy)
@@ -3,7 +3,7 @@ 
    { dg-options "-O2 -Wall" } */
 
 struct charseq {
-  unsigned char bytes[0];         // { dg-message "object declared here" }
+  unsigned char bytes[0];         // { dg-message "while referencing|object declared here" }
 };
 
 struct locale_ctype_t {
@@ -15,7 +15,7 @@  void ctype_finish (struct locale_ctype_t *ctype)
   long unsigned int cnt;
   for (cnt = 0; cnt < 20; ++cnt) {
     static struct charseq replace[2];
-    replace[0].bytes[1] = '\0';   // { dg-warning "\\\[-Wstringop-overflow" }
+    replace[0].bytes[1] = '\0';   // { dg-warning "\\\[-Warray-bounds|-Wstringop-overflow" }
     ctype->mboutdigits[cnt] = &replace[0];
   }
 }
Index: gcc/testsuite/gcc.dg/pr36902.c
===================================================================
--- gcc/testsuite/gcc.dg/pr36902.c	(revision 275387)
+++ gcc/testsuite/gcc.dg/pr36902.c	(working copy)
@@ -44,7 +44,7 @@  foo2(unsigned char * to, const unsigned char * fro
       *to = *from;
       break;
     case 5:
-      to[4] = from [4]; /* { dg-warning "array subscript is above array bounds" "" { xfail *-*-* } } */
+      to[4] = from [4]; /* { dg-warning "\\\[-Warray-bounds } */
       break;
     }
   return to;
Index: gcc/testsuite/gcc.dg/strlenopt-57.c
===================================================================
--- gcc/testsuite/gcc.dg/strlenopt-57.c	(revision 275387)
+++ gcc/testsuite/gcc.dg/strlenopt-57.c	(working copy)
@@ -21,7 +21,7 @@  void test_var_flexarray_cst_off (void)
 {
   /* Use arbitrary constants greater than 16 in case GCC ever starts
      unrolling strlen() calls with small array arguments.  */
-  a[0] = 17 < strlen (a0.a + 1);
+  a[0] = 17 < strlen (a0.a + 1);        // { dg-warning "\\\[-Warray-bounds" }
   a[1] = 19 < strlen (a1.a + 1);
   a[2] = 23 < strlen (a9.a + 9);
   a[3] = 29 < strlen (ax.a + 3);
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 275387)
+++ gcc/tree-vrp.c	(working copy)
@@ -4450,11 +4450,22 @@  vrp_prop::check_array_ref (location_t location, tr
 	}
       else
 	{
-	  tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
+	  tree ptrdiff_max = TYPE_MAX_VALUE (ptrdiff_type_node);
+	  tree maxbound = ptrdiff_max;
 	  tree arg = TREE_OPERAND (ref, 0);
 	  poly_int64 off;
 
-	  if (get_addr_base_and_unit_offset (arg, &off) && known_gt (off, 0))
+	  if (TREE_CODE (arg) == COMPONENT_REF)
+	    {
+	      /* Try to determine the size of the trailing array from
+		 its initializer (if it has one).  */
+	      if (tree refsize = component_ref_size (arg))
+		maxbound = refsize;
+	    }
+
+	  if (maxbound == ptrdiff_max
+	      && get_addr_base_and_unit_offset (arg, &off)
+	      && known_gt (off, 0))
 	    maxbound = wide_int_to_tree (sizetype,
 					 wi::sub (wi::to_wide (maxbound),
 						  off));
@@ -4689,18 +4700,23 @@  vrp_prop::check_mem_ref (location_t location, tree
   /* The type of the object being referred to.  It can be an array,
      string literal, or a non-array type when the MEM_REF represents
      a reference/subscript via a pointer to an object that is not
-     an element of an array.  References to members of structs and
-     unions are excluded because MEM_REF doesn't make it possible
-     to identify the member where the reference originated.
-     Incomplete types are excluded as well because their size is
-     not known.  */
+     an element of an array.  Incomplete types are excluded as well
+     because their size is not known.  */
   tree reftype = TREE_TYPE (arg);
   if (POINTER_TYPE_P (reftype)
       || !COMPLETE_TYPE_P (reftype)
-      || TREE_CODE (TYPE_SIZE_UNIT (reftype)) != INTEGER_CST
-      || RECORD_OR_UNION_TYPE_P (reftype))
+      || TREE_CODE (TYPE_SIZE_UNIT (reftype)) != INTEGER_CST)
     return false;
 
+  /* Except in declared objects, references to trailing array members
+     of structs and union objects are excluded because MEM_REF doesn't
+     make it possible to identify the member where the reference
+     originated.  */
+  if (RECORD_OR_UNION_TYPE_P (reftype)
+      && (!VAR_P (arg)
+	  || (DECL_EXTERNAL (arg) && array_at_struct_end_p (ref))))
+    return false;
+
   arrbounds[0] = 0;
 
   offset_int eltsize;
@@ -4710,7 +4726,14 @@  vrp_prop::check_mem_ref (location_t location, tree
       if (tree dom = TYPE_DOMAIN (reftype))
 	{
 	  tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
-	  if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
+	  if (TREE_CODE (arg) == COMPONENT_REF)
+	    {
+	      offset_int size = maxobjsize;
+	      if (tree fldsize = component_ref_size (arg))
+		size = wi::to_offset (fldsize);
+	      arrbounds[1] = wi::lrshift (size, wi::floor_log2 (eltsize));
+	    }
+	  else if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
 	    arrbounds[1] = wi::lrshift (maxobjsize, wi::floor_log2 (eltsize));
 	  else
 	    arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset (bnds[0])
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	(revision 275387)
+++ gcc/tree.c	(working copy)
@@ -67,6 +67,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "regs.h"
 #include "tree-vector-builder.h"
+#include "gimple-fold.h"
 
 /* Tree code classes.  */
 
@@ -3097,6 +3098,25 @@  first_field (const_tree type)
   return t;
 }
 
+/* Returns the last FIELD_DECL in the TYPE_FIELDS of the RECORD_TYPE or
+   UNION_TYPE TYPE, or NULL_TREE if none.  */
+
+tree
+last_field (const_tree type)
+{
+  tree last = NULL_TREE;
+
+  for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+    {
+      if (TREE_CODE (fld) != FIELD_DECL)
+	continue;
+
+      last = fld;
+    }
+
+  return last;
+}
+
 /* Concatenate two chains of nodes (chained through TREE_CHAIN)
    by modifying the last node in chain 1 to point to chain 2.
    This is the Lisp primitive `nconc'.  */
@@ -13725,8 +13745,8 @@  array_ref_up_bound (tree exp)
   return NULL_TREE;
 }
 
-/* Returns true if REF is an array reference or a component reference
-   to an array at the end of a structure.
+/* Returns true if REF is an array reference, component reference,
+   or memory reference to an array at the end of a structure.
    If this is the case, the array may be allocated larger
    than its upper bound implies.  */
 
@@ -13744,6 +13764,28 @@  array_at_struct_end_p (tree ref)
   else if (TREE_CODE (ref) == COMPONENT_REF
 	   && TREE_CODE (TREE_TYPE (TREE_OPERAND (ref, 1))) == ARRAY_TYPE)
     atype = TREE_TYPE (TREE_OPERAND (ref, 1));
+  else if (TREE_CODE (ref) == MEM_REF)
+    {
+      tree arg = TREE_OPERAND (ref, 0);
+      if (TREE_CODE (arg) == ADDR_EXPR)
+	arg = TREE_OPERAND (arg, 0);
+      tree argtype = TREE_TYPE (arg);
+      if (TREE_CODE (argtype) == RECORD_TYPE)
+	{
+	  if (tree fld = last_field (argtype))
+	    {
+	      atype = TREE_TYPE (fld);
+	      if (TREE_CODE (atype) != ARRAY_TYPE)
+		return false;
+	      if (VAR_P (arg) && DECL_SIZE (fld))
+		return false;
+	    }
+	  else
+	    return false;
+	}
+      else
+	return false;
+    }
   else
     return false;
 
@@ -13860,6 +13902,134 @@  component_ref_field_offset (tree exp)
     return SUBSTITUTE_PLACEHOLDER_IN_EXPR (DECL_FIELD_OFFSET (field), exp);
 }
 
+/* Given the initializer INIT, return the initializer for the field
+   DECL if it exists, otherwise null.  Used to obtain the initializer
+   for a flexible array member and determine its size.  */
+
+static tree
+get_initializer_for (tree init, tree decl)
+{
+  STRIP_NOPS (init);
+
+  tree fld, fld_init;
+  unsigned HOST_WIDE_INT i;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), i, fld, fld_init)
+    {
+      if (decl == fld)
+	return fld_init;
+
+      if (TREE_CODE (fld_init) == CONSTRUCTOR)
+	{
+	  fld_init = get_initializer_for (fld_init, decl);
+	  if (fld_init)
+	    return fld_init;
+	}
+    }
+
+  return NULL_TREE;
+}
+
+/* Determine the size of the flexible array FLD from the initializer
+   expression for the struct object DECL in which the member is declared
+   (possibly recursively).  Return the size, or a zero constant if it
+   isn't initialized (null if DECL is external).  */
+
+static tree
+get_flexarray_size (tree decl, tree fld)
+{
+  if (tree init = DECL_INITIAL (decl))
+    {
+      init = get_initializer_for (init, fld);
+      if (init)
+	return TYPE_SIZE_UNIT (TREE_TYPE (init));
+    }
+
+  return DECL_EXTERNAL (decl) ? NULL_TREE : integer_zero_node;
+}
+
+/* Determines the size of the member referenced by the COMPONENT_REF
+   REF, using its initializer expression if necessary in order to
+   determine the size of an initialized flexible array member.
+   Returns the size (which might be zero for an object with
+   an uninitialized flexible array member) or null if the size
+   cannot be determined.  */
+
+tree
+component_ref_size (tree ref)
+{
+  gcc_assert (TREE_CODE (ref) == COMPONENT_REF);
+
+  tree member = TREE_OPERAND (ref, 1);
+
+  /* If the member is not an array, or is not last, or is an array with
+     more than one element, return its size.  Otherwise it's either
+     a bona fide flexible array member, or a zero-length array member,
+     or an array of length one treated as such.  */
+  tree size = DECL_SIZE_UNIT (member);
+  if (size)
+    {
+      tree memtype = TREE_TYPE (member);
+      if (TREE_CODE (memtype) != ARRAY_TYPE
+	  || !array_at_struct_end_p (ref))
+	return size;
+
+      if (!integer_zerop (size))
+	if (tree dom = TYPE_DOMAIN (memtype))
+	  if (tree min = TYPE_MIN_VALUE (dom))
+	    if (tree max = TYPE_MAX_VALUE (dom))
+	      if (TREE_CODE (min) == INTEGER_CST
+		  && TREE_CODE (max) == INTEGER_CST)
+		{
+		  offset_int minidx = wi::to_offset (min);
+		  offset_int maxidx = wi::to_offset (max);
+		  if (maxidx - minidx > 0)
+		    return size;
+		}
+    }
+
+  /* If the reference is to a declared object and the member a true
+     flexible array, try to determine its size from its initializer.  */
+  poly_int64 off = 0;
+  tree base = get_addr_base_and_unit_offset (ref, &off);
+  if (!base || !VAR_P (base))
+    return NULL_TREE;
+
+  /* The size of any member of a declared object other than a flexible
+     array member is that obtained above.  */
+  tree rec = TREE_OPERAND (ref, 0);
+  tree basetype = TREE_TYPE (base);
+  bool typematch = useless_type_conversion_p (TREE_TYPE (rec), basetype);
+  if (size && typematch)
+    return size;
+
+  size = get_flexarray_size (base, member);
+
+  /* If the flexible array member has a known size use the greater
+     of it and the tail padding in the enclosing struct.
+     Otherwise, when the size of the flexible array member is unknown
+     and the referenced object is not a struct, use the size of its
+     type when known.  This detects sizes of array buffers when cast
+     to struct types with flexible array members.  */
+  if (size || !RECORD_OR_UNION_TYPE_P (basetype))
+    if (tree bsz = TYPE_SIZE_UNIT (basetype))
+      if (TREE_CODE (bsz) == INTEGER_CST)
+	{
+	  poly_int64 basesize = tree_to_poly_int64 (bsz);
+	  poly_int64 initsize = size ? tree_to_poly_int64 (size) : 0;
+
+	  poly_int64 memsize = basesize - off;
+	  if (known_lt (initsize, memsize))
+	    return wide_int_to_tree (TREE_TYPE (bsz), memsize);
+	  return size;
+	}
+
+  /* Return "don't know" for an external non-array object since its
+     flexible array member can be initialized to have any number of
+     elements.  Otherwise, return zero because the flexible array
+     member has no elements.  */
+  return (DECL_EXTERNAL (base) ? NULL_TREE : integer_zero_node);
+}
+
 /* Return the machine mode of T.  For vectors, returns the mode of the
    inner type.  The main use case is to feed the result to HONOR_NANS,
    avoiding the BLKmode that a direct TYPE_MODE (T) might return.  */
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	(revision 275387)
+++ gcc/tree.h	(working copy)
@@ -5259,6 +5259,13 @@  extern bool array_at_struct_end_p (tree);
    by EXP.  This does not include any offset in DECL_FIELD_BIT_OFFSET.  */
 extern tree component_ref_field_offset (tree);
 
+/* Return the size of the member referenced by the COMPONENT_REF, using
+   its initializer expression if necessary in order to determine the size
+   of an initialized flexible array member.  The size might be zero for
+   an object with an uninitialized flexible array member or null if it
+   cannot be determined.  */
+extern tree component_ref_size (tree);
+
 extern int tree_map_base_eq (const void *, const void *);
 extern unsigned int tree_map_base_hash (const void *);
 extern int tree_map_base_marked_p (const void *);
Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	(revision 275387)
+++ gcc/wide-int.h	(working copy)
@@ -852,6 +852,8 @@  inline HOST_WIDE_INT
 generic_wide_int <storage>::sign_mask () const
 {
   unsigned int len = this->get_len ();
+  gcc_assert (len > 0);
+
   unsigned HOST_WIDE_INT high = this->get_val ()[len - 1];
   if (!is_sign_extended)
     {