diff mbox series

[V1,1/3] Provide element_count attribute to flexible array member field (PR108896)

Message ID 20230525161450.3704901-2-qing.zhao@oracle.com
State New
Headers show
Series New attribute "element_count" to annotate bounds for C99 FAM(PR108896) | expand

Commit Message

Qing Zhao May 25, 2023, 4:14 p.m. UTC
'element_count ("COUNT")'
     The 'element_count' attribute may be attached to the flexible array
     member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of
     '__builtin_dynamic_object_size' and array bound sanitizer.

     For instance, the following declaration:

          struct P {
            size_t count;
            int array[] __attribute__ ((element_count ("count")));
          };

     specify that 'array' is a flexible array member whose number of
     element is given by the field "'count'" in the same structure.

The number of elements information provided by this attribute can be
used by __builtin_dynamic_object_size and array bound sanitizer to detect
out-of-bound errors for flexible array member references.

2023-05-17 Qing Zhao <qing.zhao@oracle.com>

gcc/c-family/ChangeLog:

	PR C/108896
	* c-attribs.cc (handle_element_count_attribute): New function.
	* c-common.cc (c_flexible_array_member_type_p): To this.
	* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

	PR C/108896
	* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
	(add_flexible_array_elts_to_size): Use renamed function.
	(is_flexible_array_member_p): Use renamed function.
	(verify_element_count_attribute): New function.
	(finish_struct): Use renamed function and verify element count
	attribute.

gcc/ChangeLog:

	PR C/108896
	* doc/extend.texi: Document attribute element_count.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-element-count.c: New test.
---
 gcc/c-family/c-attribs.cc                     | 51 ++++++++++++++++
 gcc/c-family/c-common.cc                      | 13 ++++
 gcc/c-family/c-common.h                       |  1 +
 gcc/c/c-decl.cc                               | 61 ++++++++++++++-----
 gcc/doc/extend.texi                           | 21 +++++++
 .../gcc.dg/flex-array-element-count.c         | 27 ++++++++
 6 files changed, 158 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count.c

Comments

Joseph Myers May 25, 2023, 9:02 p.m. UTC | #1
What happens if the field giving the number of elements is in a contained 
anonymous structure or union?

struct s {
  struct { size_t count; };
  int array[] __attribute__ ((element_count ("count")));
};

This ought to work - a general principle in C is that anonymous structures 
and unions are transparent as far as name lookup for fields is concerned.  
But I don't see any testcases for it and I'm not sure it would work with 
the present code.

What if the string is a wide string?  I don't expect that to work (either 
as a matter of interface design, or in the present code), but I think that 
case should have a specific check and error.

What happens in the case where -fexec-charset specifies a 
non-ASCII-compatible character set?  I expect that to work OK with the 
existing code, because translation of string literals to the execution 
character set is disabled in __attribute__ parsing, but having a testcase 
for it would be good.

What happens if the field referenced for the element count does not have 
integer type?  I'd expect an error, but don't see one in the code or tests 
here.
Qing Zhao May 26, 2023, 1:32 p.m. UTC | #2
> On May 25, 2023, at 5:02 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> What happens if the field giving the number of elements is in a contained 
> anonymous structure or union?
> 
> struct s {
>  struct { size_t count; };
>  int array[] __attribute__ ((element_count ("count")));
> };
> 
> This ought to work - a general principle in C is that anonymous structures 
> and unions are transparent as far as name lookup for fields is concerned.  
> But I don't see any testcases for it and I'm not sure it would work with 
> the present code.
Will check on this and add testing case for this case.
> 
> What if the string is a wide string?  I don't expect that to work (either 
> as a matter of interface design, or in the present code), but I think that 
> case should have a specific check and error.

Dump question: how to check whether the string is a wide string? -:)

> 
> What happens in the case where -fexec-charset specifies a 
> non-ASCII-compatible character set?  I expect that to work OK with the 
> existing code, because translation of string literals to the execution 
> character set is disabled in __attribute__ parsing, but having a testcase 
> for it would be good.
Okay, will check on this and add test case.
> 
> What happens if the field referenced for the element count does not have 
> integer type?  I'd expect an error, but don't see one in the code or tests 
> here.
Yes, that’s right, will add this.

Thanks a lot.

Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers May 26, 2023, 6:15 p.m. UTC | #3
On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:

> > What if the string is a wide string?  I don't expect that to work (either 
> > as a matter of interface design, or in the present code), but I think that 
> > case should have a specific check and error.
> 
> Dump question: how to check whether the string is a wide string? -:)

By examining the element type; the only valid case for the attribute would 
be an element type of (const) char.  (I think it's reasonable to reject 
all of char8_t, char16_t, char32_t, wchar_t strings in this context.)
Qing Zhao May 26, 2023, 7:09 p.m. UTC | #4
> On May 26, 2023, at 2:15 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> What if the string is a wide string?  I don't expect that to work (either 
>>> as a matter of interface design, or in the present code), but I think that 
>>> case should have a specific check and error.
>> 
>> Dump question: how to check whether the string is a wide string? -:)
> 
> By examining the element type; the only valid case for the attribute would 
> be an element type of (const) char.  (I think it's reasonable to reject 
> all of char8_t, char16_t, char32_t, wchar_t strings in this context.)

Okay, will study a little bit and do this.
Thanks

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Qing Zhao June 7, 2023, 7:59 p.m. UTC | #5
Hi, Joseph,

A question here:  can an identifier in C be a wide char string? 

Qing

> On May 26, 2023, at 2:15 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> What if the string is a wide string?  I don't expect that to work (either 
>>> as a matter of interface design, or in the present code), but I think that 
>>> case should have a specific check and error.
>> 
>> Dump question: how to check whether the string is a wide string? -:)
> 
> By examining the element type; the only valid case for the attribute would 
> be an element type of (const) char.  (I think it's reasonable to reject 
> all of char8_t, char16_t, char32_t, wchar_t strings in this context.)
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers June 7, 2023, 8:53 p.m. UTC | #6
On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Hi, Joseph,
> 
> A question here:  can an identifier in C be a wide char string? 

Identifiers and strings are different kinds of tokens; an identifier can't 
be a string of any kind, wide or narrow.  It just so happens that the 
proposed interface here involves interpreting the contents of a string as 
referring to an identifier (presumably for parsing convenience compared to 
using an identifier directly in an attribute).
Qing Zhao June 7, 2023, 9:32 p.m. UTC | #7
> On Jun 7, 2023, at 4:53 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Hi, Joseph,
>> 
>> A question here:  can an identifier in C be a wide char string? 
> 
> Identifiers and strings are different kinds of tokens; an identifier can't 
> be a string of any kind, wide or narrow.  It just so happens that the 
> proposed interface here involves interpreting the contents of a string as 
> referring to an identifier (presumably for parsing convenience compared to 
> using an identifier directly in an attribute).

Are you suggesting to use identifier directly as the argument of the attribute?
I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 

For example:

int count;
struct trailing_array_7 {
  Int count;
  int array_7[] __attribute ((element_count (count))); 
};

The identifier “count” inside the attribute will refer to the variable “int count” outside of the structure.

We need to introduce new syntax for this and also need to update the parser of the attribute.
Not sure at this moment whether the extra effort is necessary or not?
Any suggestions?

thanks.

Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers June 7, 2023, 10:05 p.m. UTC | #8
On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Are you suggesting to use identifier directly as the argument of the attribute?
> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
> 
> For example:
> 
> int count;
> struct trailing_array_7 {
>   Int count;
>   int array_7[] __attribute ((element_count (count))); 
> };
> 
> The identifier “count” inside the attribute will refer to the variable 
> “int count” outside of the structure.

c_parser_attribute_arguments is supposed to allow an identifier as an 
attribute argument - and not look it up (the user of the attribute would 
later need to look it up in the context of the containing structure).  
Callers use attribute_takes_identifier_p to determine which attributes 
take identifiers (versus expressions) as arguments, which would need 
updating to cover the new attribute.

There is a ??? comment about the case where the identifier is declared as 
a type name.  That would simply be one of the cases carried over from the 
old Bison parser, and it would seem reasonable to remove that 
special-casing so that the attribute works even when the identifier is 
declared as a typedef name as an ordinary identifier, since it's fine for 
structure members to have the same name as a typedef name.

Certainly taking an identifier directly seems like cleaner syntax than 
taking a string that then needs reinterpreting as an identifier.
Qing Zhao June 8, 2023, 1:06 p.m. UTC | #9
> On Jun 7, 2023, at 6:05 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Are you suggesting to use identifier directly as the argument of the attribute?
>> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
>> 
>> For example:
>> 
>> int count;
>> struct trailing_array_7 {
>>  Int count;
>>  int array_7[] __attribute ((element_count (count))); 
>> };
>> 
>> The identifier “count” inside the attribute will refer to the variable 
>> “int count” outside of the structure.
> 
> c_parser_attribute_arguments is supposed to allow an identifier as an 
> attribute argument - and not look it up (the user of the attribute would 
> later need to look it up in the context of the containing structure).  
> Callers use attribute_takes_identifier_p to determine which attributes 
> take identifiers (versus expressions) as arguments, which would need 
> updating to cover the new attribute.
> 
> There is a ??? comment about the case where the identifier is declared as 
> a type name.  That would simply be one of the cases carried over from the 
> old Bison parser, and it would seem reasonable to remove that 
> special-casing so that the attribute works even when the identifier is 
> declared as a typedef name as an ordinary identifier, since it's fine for 
> structure members to have the same name as a typedef name.
> 
> Certainly taking an identifier directly seems like cleaner syntax than 
> taking a string that then needs reinterpreting as an identifier.

Thanks a lot for the helpful info. I will study a little bit here to see how to do this.
Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Qing Zhao June 15, 2023, 3:09 p.m. UTC | #10
Hi, Joseph,

I studied c_parser_attribute_arguments and related source code, 
and also studied PLACEHOLDER_EXPR and related source code. 

Right now, I still cannot decide what’s the best user-interface should be for the 
argument of the new attribute “element_count". (Or the new attribute 
name “counted_by” as suggested by Kees). 

There are 3 possible interfaces for the argument of the new attribute “counted_by”:

The first 2 interfaces are the following A and B:

A. the argument of the new attribute “counted_by” is a string as the current patch:

struct trailing_array_A {
  Int count;
  int array_A[] __attribute ((counted_by (“count"))); 
};

B. The argument of the new attribute “counted_by” is an identifier that can be
 accepted by “c_parser_attribute_arguments”:

struct trailing_array_B {
  Int count;
  int array_B[] __attribute ((counted_by (count))); 
};

To implement this interface,  we need to adjust “attribute_takes_identifier_p” to 
accept the new attribute “counted_by” and then interpret this new identifier  “count” as a 
field of the containing structure by looking at the name.  (Otherwise, the identifier “count”
will be treated as an identifier in the current scope, which is not declared yet)

Comparing B with A, I don’t see too much benefit, either from user-interface point of view, 
or from implementation point of view.  

For implementation, both A and B need to search the fields of the containing structure by 
the name of the field “count”.

For user interface, I think that A and B are similar.

In addition to consider the user-interface and implementation, another concern is the possibility
 to extend the argument of this new attribute to an expression in the future, for example:

struct trailing_array_F {
  Int count;
  int array_F[] __attribute ((counted_by (count * 4))); 
};

In the above struct “trailing_array_F”, the argument of the attribute “counted_by” is “count * 4”, which
is an expression.  

If we plan to extend the argument of this new attribute to an expression, then neither A nor B is
good, right?

For this purpose, it might be cleaner to introduce a new syntax similar as the designator as mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896 as following, i.e, the approach C:

C. The argument of the new attribute “counted_by” is a new designator identifier that will be parsed as
The field of the containing structure:

struct trailing_array_C {
  Int count;
  int array_C[] __attribute ((counted_by (.count))); 
};

I think that after the C FE accepts this new designator syntax as the argument of the attribute, then it’s
very easy to extend the argument to an arbitrary expression later. 

For the implementation of this approach, my current thinking is: 

1. Update the routine “c_parser_postfix_expression” (is this the right place? ) to accept the new designator syntax.
2. Use “PLACEHOLDER_EXPR” to represent the containing structure, and build a COMPONENT_REF to hold
    the argument of the attribute in the IR.
3. When using this attribute in middle-end or sanitizer, use SUBSTITUTE_PLACEHOLDER_IN_EXPR(EXP, OBJ)
    to get the size info in IR. 

So, right now, I think that we might need to take the approach C?

What’s your opinion and suggestions?

Thanks a lot for your help.

Qing


> On Jun 7, 2023, at 6:05 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Are you suggesting to use identifier directly as the argument of the attribute?
>> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
>> 
>> For example:
>> 
>> int count;
>> struct trailing_array_7 {
>>  Int count;
>>  int array_7[] __attribute ((element_count (count))); 
>> };
>> 
>> The identifier “count” inside the attribute will refer to the variable 
>> “int count” outside of the structure.
> 
> c_parser_attribute_arguments is supposed to allow an identifier as an 
> attribute argument - and not look it up (the user of the attribute would 
> later need to look it up in the context of the containing structure).  
> Callers use attribute_takes_identifier_p to determine which attributes 
> take identifiers (versus expressions) as arguments, which would need 
> updating to cover the new attribute.
> 
> There is a ??? comment about the case where the identifier is declared as 
> a type name.  That would simply be one of the cases carried over from the 
> old Bison parser, and it would seem reasonable to remove that 
> special-casing so that the attribute works even when the identifier is 
> declared as a typedef name as an ordinary identifier, since it's fine for 
> structure members to have the same name as a typedef name.
> 
> Certainly taking an identifier directly seems like cleaner syntax than 
> taking a string that then needs reinterpreting as an identifier.
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers June 15, 2023, 4:55 p.m. UTC | #11
On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Comparing B with A, I don’t see too much benefit, either from 
> user-interface point of view, or from implementation point of view.
> 
> For implementation, both A and B need to search the fields of the 
> containing structure by the name of the field “count”.
> 
> For user interface, I think that A and B are similar.

But as a language design matter, there are no standard C interfaces that 
interpret a string as an identifier, so doing so does not fit well with 
the language.

> 1. Update the routine “c_parser_postfix_expression” (is this the right 
> place? ) to accept the new designator syntax.

Any design that might work with an expression is the sort of thing that 
would likely involve many iterations on the specification (i.e. proposed 
wording changes to the C standard) for the interpretation of the new kinds 
of expressions, including how to resolve syntactic ambiguities and how 
name lookup works, before it could be considered ready to implement, and 
then a lot more work on the specification based on implementation 
experience.

Note that no expressions can start with the '.' token at present.  As soon 
as you invent a new kind of expression that can start with that token, you 
have syntactic ambiguity.

struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };

Is ".c=.c" a use of the existing syntax for designated initializers, with 
the first ".c" being a designator and the second being a use of the new 
kind of expression, or is it an assignment expression, where both the LHS 
and the RHS of the assignment use the new kind of expression?  And do 
those .c, when the use the new kind of expression, refer to the inner or 
outer struct definition?

There are obvious advantages to using tokens that don't introduce such an 
ambiguity with designators (i.e., not '.' as the token to start the new 
kind of expression, but something that cannot start a designator), if such 
tokens can be found.  But you still have the name lookup question when 
there are multiple nested structure definitions.  And the question of when 
expressions are considered to be evaluated, if they have side effects such 
as ".c=.c" does.

"Whatever falls out of the implementation" is not a good approach for 
language design here.  If you want a new kind of expressions here, you 
need a careful multi-implementation design phase that produces a proper 
specification and has good reasons for the particular choices made in 
cases of ambiguity.
Qing Zhao June 15, 2023, 7:54 p.m. UTC | #12
> On Jun 15, 2023, at 12:55 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Comparing B with A, I don’t see too much benefit, either from 
>> user-interface point of view, or from implementation point of view.
>> 
>> For implementation, both A and B need to search the fields of the 
>> containing structure by the name of the field “count”.
>> 
>> For user interface, I think that A and B are similar.
> 
> But as a language design matter, there are no standard C interfaces that 
> interpret a string as an identifier, so doing so does not fit well with 
> the language.

Okay, makes sense.  So I will choose B over A. -:) 
> 
>> 1. Update the routine “c_parser_postfix_expression” (is this the right 
>> place? ) to accept the new designator syntax.
> 
> Any design that might work with an expression is the sort of thing that 
> would likely involve many iterations on the specification (i.e. proposed 
> wording changes to the C standard) for the interpretation of the new kinds 
> of expressions, including how to resolve syntactic ambiguities and how 
> name lookup works, before it could be considered ready to implement, and 
> then a lot more work on the specification based on implementation 
> experience.

Okay, I see the complication to make such new syntax into C standard…

> 
> Note that no expressions can start with the '.' token at present.  As soon 
> as you invent a new kind of expression that can start with that token, you 
> have syntactic ambiguity.
> 
> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> 
> Is ".c=.c" a use of the existing syntax for designated initializers, with 
> the first ".c" being a designator and the second being a use of the new 
> kind of expression, or is it an assignment expression, where both the LHS 
> and the RHS of the assignment use the new kind of expression?  And do 
> those .c, when the use the new kind of expression, refer to the inner or 
> outer struct definition?

Okay, I see. Yes, this will be really confusing. 

> 
> There are obvious advantages to using tokens that don't introduce such an 
> ambiguity with designators (i.e., not '.' as the token to start the new 
> kind of expression, but something that cannot start a designator), if such 
> tokens can be found.  But you still have the name lookup question when 
> there are multiple nested structure definitions.  And the question of when 
> expressions are considered to be evaluated, if they have side effects such 
> as ".c=.c" does.
> 
> "Whatever falls out of the implementation" is not a good approach for 
> language design here.  If you want a new kind of expressions here, you 
> need a careful multi-implementation design phase that produces a proper 
> specification and has good reasons for the particular choices made in 
> cases of ambiguity.

Thanks a lot for your detailed explanation on the language design concerns. 
For this new attribute, I was convinced that it might not worth the effort to introduce a new syntax at this stage.

Another question,  whether it’s possible to extend such attribute later to accept expression as its argument if we take approach B:

B. The argument of the new attribute “counted_by” is an identifier that can be
accepted by “c_parser_attribute_arguments”:

struct trailing_array_B {
 Int count;
 int array_B[] __attribute ((counted_by (count))); 
};


From my current very limited understanding of the C FE source code, it’s not easy to extend the argument to an expression later for the above.
Is this understanding right?

(The motivation of accepting expression as the argument for the new attribute “counted_by” is 
   from the proposal for LLVM: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854:

	• __counted_by(N) : The pointer points to memory that contains N elements of pointee type. N is an expression of integer type which can be a simple reference to declaration, a constant including calls to constant functions, or an arithmetic expression that does not have side effect. The annotation cannot apply to pointers to incomplete types or types without size such as  void *.
)
 
thanks.
Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Joseph Myers June 15, 2023, 10:48 p.m. UTC | #13
On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:

> B. The argument of the new attribute “counted_by” is an identifier that can be
> accepted by “c_parser_attribute_arguments”:
> 
> struct trailing_array_B {
>  Int count;
>  int array_B[] __attribute ((counted_by (count))); 
> };
> 
> 
> From my current very limited understanding of the C FE source code, it’s 
> not easy to extend the argument to an expression later for the above. Is 
> this understanding right?

It wouldn't be entirely compatible: if you change to interpreting the 
argument as an expression, then the above would suggest a global variable 
count is used (as opposed to some other syntax for referring to an element 
of the containing structure).

So an attribute that takes an element name might best be a *different* 
attribute from any potential future one taking an expression (with some 
new syntax to refer to an element).
Martin Uecker June 16, 2023, 7:21 a.m. UTC | #14
Am Donnerstag, dem 15.06.2023 um 16:55 +0000 schrieb Joseph Myers:
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
...
> > 1. Update the routine “c_parser_postfix_expression” (is this the right 
> > place? ) to accept the new designator syntax.
> 
> Any design that might work with an expression is the sort of thing that 
> would likely involve many iterations on the specification (i.e. proposed 
> wording changes to the C standard) for the interpretation of the new kinds 
> of expressions, including how to resolve syntactic ambiguities and how 
> name lookup works, before it could be considered ready to implement, and 
> then a lot more work on the specification based on implementation 
> experience.
> 
> Note that no expressions can start with the '.' token at present.  As soon 
> as you invent a new kind of expression that can start with that token, you 
> have syntactic ambiguity.
> 
> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> 
> Is ".c=.c" a use of the existing syntax for designated initializers, with 
> the first ".c" being a designator and the second being a use of the new 
> kind of expression, or is it an assignment expression, where both the LHS 
> and the RHS of the assignment use the new kind of expression?  And do 
> those .c, when the use the new kind of expression, refer to the inner or 
> outer struct definition?

I would treat this is one integrated feature. Essentially .c is
somthing like this->c for the current struct for designated
initializer *and* size expressions because it is semantically 
so close.    In the initializer I would allow only 
the current use for designated initialization for all names of
member of the currently initialized struct,  so .c = .c would 
be invalid.   It should never refer to the outer struct if there
is a member with the same name in the inner struct, i.e. the
outside member is then hidden. 

So this would be ok:

struct s1 { int d; char a[(struct s2 { int c; char b[.c]; }) {.c=.d}.c]; };

Here the use of .d would be ok because it is not from the struct
currently initialized, but from an outside scope.

Martin
Qing Zhao June 16, 2023, 3:01 p.m. UTC | #15
> On Jun 15, 2023, at 6:48 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> B. The argument of the new attribute “counted_by” is an identifier that can be
>> accepted by “c_parser_attribute_arguments”:
>> 
>> struct trailing_array_B {
>> Int count;
>> int array_B[] __attribute ((counted_by (count))); 
>> };
>> 
>> 
>> From my current very limited understanding of the C FE source code, it’s 
>> not easy to extend the argument to an expression later for the above. Is 
>> this understanding right?
> 
> It wouldn't be entirely compatible: if you change to interpreting the 
> argument as an expression, then the above would suggest a global variable 
> count is used (as opposed to some other syntax for referring to an element 
> of the containing structure).

Yeah, that’s the reason I tried to introduce the new “.count” syntax for the argument 
of the new attribute in the very beginning in order to avoid such incompatible issue later.  -:)
> 
> So an attribute that takes an element name might best be a *different* 
> attribute from any potential future one taking an expression (with some 
> new syntax to refer to an element).

So, if we add this “counted_by (identifier)” attribute now, and later we need to add another
 new attribute “new_counted_by (expression)”  at that time if needed?

Kees, what’s your opinion on this?

thanks.

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Qing Zhao June 16, 2023, 3:14 p.m. UTC | #16
> On Jun 16, 2023, at 3:21 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Donnerstag, dem 15.06.2023 um 16:55 +0000 schrieb Joseph Myers:
>> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
>> 
> ...
>>> 1. Update the routine “c_parser_postfix_expression” (is this the right 
>>> place? ) to accept the new designator syntax.
>> 
>> Any design that might work with an expression is the sort of thing that 
>> would likely involve many iterations on the specification (i.e. proposed 
>> wording changes to the C standard) for the interpretation of the new kinds 
>> of expressions, including how to resolve syntactic ambiguities and how 
>> name lookup works, before it could be considered ready to implement, and 
>> then a lot more work on the specification based on implementation 
>> experience.
>> 
>> Note that no expressions can start with the '.' token at present.  As soon 
>> as you invent a new kind of expression that can start with that token, you 
>> have syntactic ambiguity.
>> 
>> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
>> 
>> Is ".c=.c" a use of the existing syntax for designated initializers, with 
>> the first ".c" being a designator and the second being a use of the new 
>> kind of expression, or is it an assignment expression, where both the LHS 
>> and the RHS of the assignment use the new kind of expression?  And do 
>> those .c, when the use the new kind of expression, refer to the inner or 
>> outer struct definition?
> 
> I would treat this is one integrated feature. Essentially .c is
> somthing like this->c for the current struct for designated
> initializer *and* size expressions because it is semantically 
> so close.  

Yes, I think this is reasonable. (Is “this” the immediate containing structure?)

>  In the initializer I would allow only 
> the current use for designated initialization for all names of
> member of the currently initialized struct,  so .c = .c would 
> be invalid.

Given “.c” basically is “this->c”, then .c = .c should be considered as
this->c = this->c, is such self-initialization allowed in C?

>   It should never refer to the outer struct if there
> is a member with the same name in the inner struct, i.e. the
> outside member is then hidden. 

Does the above mean:  if there is NO name conflicting, it could refer to a field of an outer struct?

Why this is allowed? Why just disallow all referring to the field of the outer structure since .c basically is this->c?
> 
> So this would be ok:
> 
> struct s1 { int d; char a[(struct s2 { int c; char b[.c]; }) {.c=.d}.c]; };
> 
> Here the use of .d would be ok because it is not from the struct
> currently initialized, but from an outside scope.

I think that the above .c=.d should report an error, since .d does not exist in the containing structure.

Do I miss anything here?

thanks.

Qing
> 
> Martin
> 
> 
> 
>
Joseph Myers June 16, 2023, 4:21 p.m. UTC | #17
On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:

> > Note that no expressions can start with the '.' token at present.  As soon 
> > as you invent a new kind of expression that can start with that token, you 
> > have syntactic ambiguity.
> > 
> > struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> > 
> > Is ".c=.c" a use of the existing syntax for designated initializers, with 
> > the first ".c" being a designator and the second being a use of the new 
> > kind of expression, or is it an assignment expression, where both the LHS 
> > and the RHS of the assignment use the new kind of expression?  And do 
> > those .c, when the use the new kind of expression, refer to the inner or 
> > outer struct definition?
> 
> I would treat this is one integrated feature. Essentially .c is
> somthing like this->c for the current struct for designated
> initializer *and* size expressions because it is semantically 
> so close.    In the initializer I would allow only 
> the current use for designated initialization for all names of
> member of the currently initialized struct,  so .c = .c would 
> be invalid.   It should never refer to the outer struct if there

I'm not clear on what the intended disambiguation rule here is, when "." 
is seen in initializer list context - does this rule depend on whether the 
following identifier is a member of the struct being initialized, so 
".c=.c" would be OK above if the initialized struct didn't have a member 
called c but the outer struct definition did?  That seems like a rather 
messy rule.  And does "would allow only" apply other than in the ambiguous 
context?  That seems to be implied by ".c=.c" being invalid above, because 
to make it invalid you need to disallow the new construct being used for 
the second .c, not just make the first .c interpreted as a designator.

Again, this sort of thing needs a detailed written specification, with 
multiple iterations discussed among different implementations.  The above 
paragraph doesn't make clear to me any of: the disambiguation rules; what 
is allowed in what context; how name lookup works (consider tricky cases 
such as a reference to an identifier declared *later* in the same struct, 
possibly in the context of C2x tag compatibility where a previous 
definition of the struct is visible); when these expressions get 
evaluated; what the underlying principles are behind those choices.

Using a token (existing or new) other than '.' - one that doesn't 
introduce ambiguity in any context where expressions can be used - would 
help significantly, although some of the issues would still apply.
Martin Uecker June 16, 2023, 5:07 p.m. UTC | #18
Am Freitag, dem 16.06.2023 um 16:21 +0000 schrieb Joseph Myers:
> On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:
> 
> > > Note that no expressions can start with the '.' token at present.  As soon 
> > > as you invent a new kind of expression that can start with that token, you 
> > > have syntactic ambiguity.
> > > 
> > > struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> > > 
> > > Is ".c=.c" a use of the existing syntax for designated initializers, with 
> > > the first ".c" being a designator and the second being a use of the new 
> > > kind of expression, or is it an assignment expression, where both the LHS 
> > > and the RHS of the assignment use the new kind of expression?  And do 
> > > those .c, when the use the new kind of expression, refer to the inner or 
> > > outer struct definition?
> > 
> > I would treat this is one integrated feature. Essentially .c is
> > somthing like this->c for the current struct for designated
> > initializer *and* size expressions because it is semantically 
> > so close.    In the initializer I would allow only 
> > the current use for designated initialization for all names of
> > member of the currently initialized struct,  so .c = .c would 
> > be invalid.   It should never refer to the outer struct if there
> 
> I'm not clear on what the intended disambiguation rule here is, when "." 
> is seen in initializer list context - does this rule depend on whether the 
> following identifier is a member of the struct being initialized, so 
> ".c=.c" would be OK above if the initialized struct didn't have a member 
> called c but the outer struct definition did? 

When initializers are parsed it is already clear what
the names of the members of the inner struct are, so
one can differentiate between designated initializers 
and potential other uses in an expression. 

So the main rule is: if you parse .something in a context
where a designator is allowed and "something" is a member
of the current struct, then it is a designator.

So for 

struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };

one knows during parsing that the .d is a designator
and that .c is not. For

struct foo { int c; int buf[(struct { int d; }){ .c = .c }]; };

one knows that both uses of .c are not.

Whether these different use cases should be allowed or not
is a different question, but my point is that there does
not seem to be a problem directly identifying the uses 
as a designator as usual. To me, this seems to imply that
it is safe to use the same syntax.

>  That seems like a rather 
> messy rule.  And does "would allow only" apply other than in the ambiguous 
> context?  That seems to be implied by ".c=.c" being invalid above, because 
> to make it invalid you need to disallow the new construct being used for 
> the second .c, not just make the first .c interpreted as a designator.

Yes. 
> 
> Again, this sort of thing needs a detailed written specification, with 
> multiple iterations discussed among different implementations. 

Oh, I agree with this.

>  The above 
> paragraph doesn't make clear to me any of: the disambiguation rules; what 
> is allowed in what context; how name lookup works (consider tricky cases 
> such as a reference to an identifier declared *later* in the same struct, 
> possibly in the context of C2x tag compatibility where a previous 
> definition of the struct is visible); when these expressions get 
> evaluated; what the underlying principles are behind those choices.

I also agree that all this needs careful consideration and written
rules.  My point is mereley that there does not seem to be a
fundamental issue differentiating the new feature from 
designators during parsing, so there may not be a risk using 
the same syntax.

> Using a token (existing or new) other than '.' - one that doesn't 
> introduce ambiguity in any context where expressions can be used - would 
> help significantly, although some of the issues would still apply.

The cost of using a new symbol is that one has two different
syntax for something which is semantically equivalent, i.e.
a notion to refer to a member of the current struct.

Martin

>
Qing Zhao June 16, 2023, 8:20 p.m. UTC | #19
> On Jun 16, 2023, at 1:07 PM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Freitag, dem 16.06.2023 um 16:21 +0000 schrieb Joseph Myers:
>> On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:
>> 
>>>> Note that no expressions can start with the '.' token at present.  As soon 
>>>> as you invent a new kind of expression that can start with that token, you 
>>>> have syntactic ambiguity.
>>>> 
>>>> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
>>>> 
>>>> Is ".c=.c" a use of the existing syntax for designated initializers, with 
>>>> the first ".c" being a designator and the second being a use of the new 
>>>> kind of expression, or is it an assignment expression, where both the LHS 
>>>> and the RHS of the assignment use the new kind of expression?  And do 
>>>> those .c, when the use the new kind of expression, refer to the inner or 
>>>> outer struct definition?
>>> 
>>> I would treat this is one integrated feature. Essentially .c is
>>> somthing like this->c for the current struct for designated
>>> initializer *and* size expressions because it is semantically 
>>> so close.    In the initializer I would allow only 
>>> the current use for designated initialization for all names of
>>> member of the currently initialized struct,  so .c = .c would 
>>> be invalid.   It should never refer to the outer struct if there
>> 
>> I'm not clear on what the intended disambiguation rule here is, when "." 
>> is seen in initializer list context - does this rule depend on whether the 
>> following identifier is a member of the struct being initialized, so 
>> ".c=.c" would be OK above if the initialized struct didn't have a member 
>> called c but the outer struct definition did? 
> 
> When initializers are parsed it is already clear what
> the names of the members of the inner struct are, so
> one can differentiate between designated initializers 
> and potential other uses in an expression. 
> 
> So the main rule is: if you parse .something in a context
> where a designator is allowed and "something" is a member
> of the current struct, then it is a designator.

So, Limiting the .something ONLY to the CURRENT structure/union might be the simple and clean rule.

And I guess that this is also the rule for the current designator initializer syntax in C99?

> 
> So for 
> 
> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
> 
> one knows during parsing that the .d is a designator
> and that .c is not.

Therefore, the above should be invalid based on this rule since .c is not a member in the current structure.


> For
> 
> struct foo { int c; int buf[(struct { int d; }){ .c = .c }]; };
> 
> one knows that both uses of .c are not.

And this also is invalid since .c is not to a member in the current structure. 

> 
> Whether these different use cases should be allowed or not
> is a different question, but my point is that there does
> not seem to be a problem directly identifying the uses 
> as a designator as usual. To me, this seems to imply that
> it is safe to use the same syntax.
> 
>> That seems like a rather 
>> messy rule.  And does "would allow only" apply other than in the ambiguous 
>> context?  That seems to be implied by ".c=.c" being invalid above, because 
>> to make it invalid you need to disallow the new construct being used for 
>> the second .c, not just make the first .c interpreted as a designator.
> 
> Yes. 
>> 
>> Again, this sort of thing needs a detailed written specification, with 
>> multiple iterations discussed among different implementations. 
> 
> Oh, I agree with this.
> 
>> The above 
>> paragraph doesn't make clear to me any of: the disambiguation rules; what 
>> is allowed in what context; how name lookup works (consider tricky cases 
>> such as a reference to an identifier declared *later* in the same struct, 
>> possibly in the context of C2x tag compatibility where a previous 
>> definition of the struct is visible); when these expressions get 
>> evaluated; what the underlying principles are behind those choices.
> 
> I also agree that all this needs careful consideration and written
> rules.  My point is mereley that there does not seem to be a
> fundamental issue differentiating the new feature from 
> designators during parsing, so there may not be a risk using 
> the same syntax.

Yes, I agree on this. 
Extending the existing designated initializer syntax, .member, for the purpose of the argument of the new attribute seems very natural. 
If we can use this syntax in the argument of this new attribute, 
1. it will be easy to extend the argument of this attribute to an expression. 
2. It will also easy to use this syntax later if we accept the following

struct foo {
    ...
    unsigned int count;
    ...
    int data[.count];
};


thanks.

Qing
> 
>> Using a token (existing or new) other than '.' - one that doesn't 
>> introduce ambiguity in any context where expressions can be used - would 
>> help significantly, although some of the issues would still apply.
> 
> The cost of using a new symbol is that one has two different
> syntax for something which is semantically equivalent, i.e.
> a notion to refer to a member of the current struct.
> 
> Martin
Joseph Myers June 16, 2023, 9:35 p.m. UTC | #20
On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:

> > So for 
> > 
> > struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
> > 
> > one knows during parsing that the .d is a designator
> > and that .c is not.
> 
> Therefore, the above should be invalid based on this rule since .c is 
> not a member in the current structure.

What do you mean by "current structure"?  I think two different concepts 
are being conflated: the structure *being initialized* (what the C 
standard calls the "current object" for a brace-enclosed initializer 
list), and the structure *being defined*.  The former is what's relevant 
for designators.  The latter is what's relevant for the suggested new 
syntax.  And .c *is* a member of the structure being defined in this 
example.

Those two structure types are always different, except for corner cases 
with C2x tag compatibility (where an object of structure type might be 
initialized in the middle of a redefinition of that type).
Qing Zhao June 20, 2023, 7:40 p.m. UTC | #21
> On Jun 16, 2023, at 5:35 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> So for 
>>> 
>>> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
>>> 
>>> one knows during parsing that the .d is a designator
>>> and that .c is not.
>> 
>> Therefore, the above should be invalid based on this rule since .c is 
>> not a member in the current structure.
> 
> What do you mean by "current structure"?  I think two different concepts 
> are being conflated: the structure *being initialized* (what the C 
> standard calls the "current object" for a brace-enclosed initializer 
> list),

I think the concept of “current structure” should be stick to this. 

> and the structure *being defined*.
Not this.

(Forgive me about my poor English -:)).

Then it will be cleaner? 

What’s your opinion?


>  The former is what's relevant 
> for designators.  The latter is what's relevant for the suggested new 
> syntax.  And .c *is* a member of the structure being defined in this 
> example.
> 
> Those two structure types are always different, except for corner cases 
> with C2x tag compatibility (where an object of structure type might be 
> initialized in the middle of a redefinition of that type).

Can you give an example on this?  Thanks.

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com
Qing Zhao June 27, 2023, 3:44 p.m. UTC | #22
Hi,

Based on the discussion so far and further consideration, the following is my plan for this new attribute:

1.  The syntax of the new attribute will be:

__attribute__((counted_by (count_field_id)));

In the above, count_field_id is the identifier for the field that carries the number 
of elements info in the same structure of the FAM. 

For example:

struct object {
..
size_t count:  /* carries the number of elements info for the FAM flex.  */
int flex[] __attribute__((counted_by (count)));
};

2.  Later, if the argument of the this attribute need to be extended to an expression, we might need to 
extend the C FE to accept ".count”  in the future. 

Let me know if you have further comments and suggestions.

thanks.

Qing

> On Jun 20, 2023, at 3:40 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Jun 16, 2023, at 5:35 PM, Joseph Myers <joseph@codesourcery.com> wrote:
>> 
>> On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:
>> 
>>>> So for 
>>>> 
>>>> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
>>>> 
>>>> one knows during parsing that the .d is a designator
>>>> and that .c is not.
>>> 
>>> Therefore, the above should be invalid based on this rule since .c is 
>>> not a member in the current structure.
>> 
>> What do you mean by "current structure"?  I think two different concepts 
>> are being conflated: the structure *being initialized* (what the C 
>> standard calls the "current object" for a brace-enclosed initializer 
>> list),
> 
> I think the concept of “current structure” should be stick to this. 
> 
>> and the structure *being defined*.
> Not this.
> 
> (Forgive me about my poor English -:)).
> 
> Then it will be cleaner? 
> 
> What’s your opinion?
> 
> 
>> The former is what's relevant 
>> for designators.  The latter is what's relevant for the suggested new 
>> syntax.  And .c *is* a member of the structure being defined in this 
>> example.
>> 
>> Those two structure types are always different, except for corner cases 
>> with C2x tag compatibility (where an object of structure type might be 
>> initialized in the middle of a redefinition of that type).
> 
> Can you give an example on this?  Thanks.
> 
> Qing
>> 
>> -- 
>> Joseph S. Myers
>> joseph@codesourcery.com
diff mbox series

Patch

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 072cfb69147..d45d11077c3 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -103,6 +103,8 @@  static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
 						  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 						 int, bool *);
+static tree handle_element_count_attribute (tree *, tree, tree,
+					   int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -373,6 +375,8 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_warn_if_not_aligned_attribute, NULL },
   { "strict_flex_array",      1, 1, true, false, false, false,
 			      handle_strict_flex_array_attribute, NULL },
+  { "element_count",	      1, 1, true, false, false, false,
+			      handle_element_count_attribute, NULL },
   { "weak",                   0, 0, true,  false, false, false,
 			      handle_weak_attribute, NULL },
   { "noplt",                   0, 0, true,  false, false, false,
@@ -2555,6 +2559,53 @@  handle_strict_flex_array_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "element_count" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_element_count_attribute (tree *node, tree name,
+				tree args, int ARG_UNUSED (flags),
+				bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree argval = TREE_VALUE (args);
+
+  /* This attribute only applies to field decls of a structure.  */
+  if (TREE_CODE (decl) != FIELD_DECL)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for non-field"
+		" declaration %q+D", name, decl);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to field with array type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non-array field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to a C99 flexible array member type.  */
+  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non"
+		" flexible array member field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* The argument of the attribute should be a string.  */
+  else if (TREE_CODE (argval) != STRING_CST)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute argument not a string", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "weak" attribute; arguments as in
    struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 2b4c82facf7..b2bf7a230e2 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9516,6 +9516,19 @@  c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
+bool
+c_flexible_array_member_type_p (const_tree type)
+{
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && TYPE_SIZE (type) == NULL_TREE
+      && TYPE_DOMAIN (type) != NULL_TREE
+      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
+    return true;
+
+  return false;
+}
+
 /* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
    values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
 unsigned int
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index f96350b64af..77f6633b862 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -907,6 +907,7 @@  extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern bool c_flexible_array_member_type_p (const_tree);
 extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f8ede362bfd..0f25fe0be0d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5187,19 +5187,6 @@  set_array_declarator_inner (struct c_declarator *decl,
   return decl;
 }
 
-/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
-static bool
-flexible_array_member_type_p (const_tree type)
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && TYPE_SIZE (type) == NULL_TREE
-      && TYPE_DOMAIN (type) != NULL_TREE
-      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
-    return true;
-
-  return false;
-}
-
 /* Determine whether TYPE is a one-element array type "[1]".  */
 static bool
 one_element_array_type_p (const_tree type)
@@ -5236,7 +5223,7 @@  add_flexible_array_elts_to_size (tree decl, tree init)
 
   elt = CONSTRUCTOR_ELTS (init)->last ().value;
   type = TREE_TYPE (elt);
-  if (flexible_array_member_type_p (type))
+  if (c_flexible_array_member_type_p (type))
     {
       complete_array_type (&type, elt, false);
       DECL_SIZE (decl)
@@ -9087,7 +9074,7 @@  is_flexible_array_member_p (bool is_last_field,
 
   bool is_zero_length_array = zero_length_array_type_p (TREE_TYPE (x));
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
-  bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
+  bool is_flexible_array = c_flexible_array_member_type_p (TREE_TYPE (x));
 
   unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
@@ -9117,6 +9104,45 @@  is_flexible_array_member_p (bool is_last_field,
   return false;
 }
 
+/* Verify the argument of the element_count attribute of the flexible array
+   member FIELD_DECL is a valid field of the containing structure's fieldlist,
+   FIELDLIST, Report error when it's not.  */
+static void
+verify_element_count_attribute (tree fieldlist, tree field_decl)
+{
+  tree attr_element_count = lookup_attribute ("element_count",
+					      DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_element_count)
+    return;
+
+  /* If there is an element_count attribute attached to the field,
+     verify it.  */
+
+  const char *fieldname
+    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr_element_count)));
+
+  /* Verify the argument of the attrbute is a valid field of the
+     containing structure.  */
+
+  tree element_count_field = NULL_TREE;
+  for (tree field = fieldlist; field; field = DECL_CHAIN (field))
+    if (TREE_CODE (field) == FIELD_DECL
+	&& DECL_NAME (field) != NULL
+	&& strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
+      {
+	element_count_field = field;
+	break;
+      }
+
+  /* Error when the field is not found in the containing structure.  */
+  if (!element_count_field)
+      error_at (DECL_SOURCE_LOCATION (field_decl),
+		"%qE attribute argument not a field declaration"
+		" in the same structure",
+		(get_attribute_name (attr_element_count)));
+  return;
+}
 
 /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.
    LOC is the location of the RECORD_TYPE or UNION_TYPE's definition.
@@ -9237,7 +9263,7 @@  finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	DECL_PACKED (x) = 1;
 
       /* Detect flexible array member in an invalid context.  */
-      if (flexible_array_member_type_p (TREE_TYPE (x)))
+      if (c_flexible_array_member_type_p (TREE_TYPE (x)))
 	{
 	  if (TREE_CODE (t) == UNION_TYPE)
 	    {
@@ -9258,6 +9284,9 @@  finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 			"members");
 	      TREE_TYPE (x) = error_mark_node;
 	    }
+	  /* if there is an element_count attribute attached to this field,
+	     verify it.  */
+	  verify_element_count_attribute (fieldlist, x);
 	}
 
       if (pedantic && TREE_CODE (t) == RECORD_TYPE
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 69b21a75e62..cdbe3923ef1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7501,6 +7501,27 @@  When both the attribute and the option present at the same time, the level of
 the strictness for the specific trailing array field is determined by the
 attribute.
 
+@cindex @code{element_count} variable attribute
+@item element_count ("@var{count}")
+The @code{element_count} attribute may be attached to the flexible array
+member of a structure.  It indicates that the number of the elements of the
+array is given by the field named "@var{count}" in the same structure as the
+flexible array member.  GCC uses this information to improve the results of
+@code{__builtin_dynamic_object_size} and array bound sanitizer.
+
+For instance, the following declaration:
+
+@smallexample
+struct P @{
+  size_t count;
+  int array[] __attribute__ ((element_count ("count")));
+@};
+@end smallexample
+
+@noindent
+specify that @code{array} is a flexible array member whose number of element
+is given by the field "@code{count}" in the same structure.
+
 @cindex @code{alloc_size} variable attribute
 @item alloc_size (@var{position})
 @itemx alloc_size (@var{position-1}, @var{position-2})
diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count.c b/gcc/testsuite/gcc.dg/flex-array-element-count.c
new file mode 100644
index 00000000000..988f41e9f5e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-element-count.c
@@ -0,0 +1,27 @@ 
+/* testing the correct usage of attribute element_count.  */   
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int size;
+int x __attribute ((element_count ("size"))); /* { dg-error "attribute may not be specified for non-field declaration" } */
+
+struct trailing {
+  int count;
+  int field __attribute ((element_count ("count"))); /* { dg-error "attribute may not be specified for a non-array field" } */
+};
+
+struct trailing_1 {
+  int count;
+  int array_1[0] __attribute ((element_count ("count"))); /* { dg-error "attribute may not be specified for a non flexible array member field" } */
+};
+
+int count;
+struct trailing_array_2 {
+  int count;
+  int array_2[] __attribute ((element_count (count))); /* { dg-error "attribute argument not a string" } */
+};
+
+struct trailing_array_3 {
+  int other;
+  int array_3[] __attribute ((element_count ("count"))); /* { dg-error "attribute argument not a field declaration in the same structure" } */
+};