diff mbox

[C] Fix pr44517

Message ID 4C21C7EE.4060805@oracle.com
State New
Headers show

Commit Message

Shujing Zhao June 23, 2010, 8:38 a.m. UTC
On 06/22/2010 08:55 PM, Joseph S. Myers wrote:
> On Tue, 22 Jun 2010, Shujing Zhao wrote:
> 
>> Hi,
>>
>> This patch tries to improve diagnostic for wrong type name in function
>> declaration. It also changes the algorithm of function
>> c_parser_parms_list_declarator. If one of parameter declaration is wrong,
>> c_parser_parms_list_declarator would return NULL, not an empty parameter
>> information struct. A test case is added to test this change.
> 
> You are changing the semantics of the variable good_parm from meaning 
> "there was at least one good parameter" to "there were no bad parameters".  
> Now, such a change should be accompanied by a change of name (e.g. to 
> bad_parm, also reversing the sense of the variable).
> 
> There was a previous invariant that get_pending_sizes would be called 
> after any parameters were parsed, either directly or via get_parm_info 
> because good_parm would be set, and with this patch, this invariant is no 
> longer maintained.  This is unsafe; you need to run this cleanup, whatever 
> you then return from c_parser_parms_list_declarator.  Consider for example 
> the testcase:
> 
> void foo(int n, int a[n], pid_t x);
> void bar() {}
> 
> (related to gcc.dg/noncompile/pr35444-*.c).  With your patch, this 
> testcase (which should be added to the next revision of the patch) gets an 
> ICE, because you lost the call to get_pending_sizes via get_parm_info.
> 
You are right! Yes, If append the above test case to the end of pr44517.c, it 
gets an ICE.
The above problem is fixed at the updated patch and the test case is moved to 
the directory gcc.dg/noncompile/.

>> +      error ("unknown type name %qE", token->value);
> 
> I don't think %qE is appropriate here if the token is not an identifier.
> 
Yes. But the problem is that token->id_kind is C_ID_ID and token->value is 
IDENTIFIER_NODE. At function c_lex_one_token, the token->id_kind is always be 
set to C_ID_ID if it is not the other identifier. Look at enum c_id_kind, 
C_ID_ID is "an ordinary identifier" and there is an "not an identifier" 
C_ID_NONE, but it never be really set. If token is CPP_NAME, and it was not 
declared as some type name, the token->id_kind should be set C_ID_NONE. But 
where should C_ID_ID be set?
I think that is the problem of c_lex_one_token, not the message format. I can't 
give a solution for that issue now. Does this patch can be committed firstly?

Retested on i686-pc-linux-gnu. Is it ok?

Thanks
Pearly
gcc/
2010-06-23  Shujing Zhao  <pearly.zhao@oracle.com>

	PR c/44517
	* c-parser.c (c_parser_parms_list_declarator): Return NULL if one of
	parameters are not good.
	(c_parser_parameter_declaration): Error unknown type name if the
	parameter can't start declaration specifiers.

gcc/testsuite/
2010-06-23  Shujing Zhao  <pearly.zhao@oracle.com>

	PR c/44517
	* gcc.dg/noncompile/pr44517.c: New.
	* gcc.dg/noncompile/990416-1.c: Adjust expected error.

Comments

Manuel López-Ibáñez June 23, 2010, 8:56 a.m. UTC | #1
Nice improvement! Some unimportant comments:

@@ -2843,8 +2828,12 @@ c_parser_parameter_declaration (c_parser
     {
       /* ??? In some Objective-C cases '...' isn't applicable so there
 	 should be a different message.  */
-      c_parser_error (parser,
-		      "expected declaration specifiers or %<...%>");
+      c_token *token = c_parser_peek_token (parser);
+      if (parser->error)
+	return NULL;
+      parser->error = true;
+      c_parser_set_source_position_from_token (token);
+      error ("unknown type name %qE", token->value);
       c_parser_skip_to_end_of_parameter (parser);
       return NULL;
     }

The ??? comment does not make sense anymore (and without any example
wasn't very useful to start with), I would propose to remove it. BTW,
do you know that there is a error_at (LOCATION) function? I am not
sure whether here using that is more correct (or efficient) than
c_parser_set_source_position_from_token but just to let you know for
future patches.


+/* PR c/44517: Improve diagnostic for misspelled typename in function
declaration. */
+int foo(int x, pid_t y, long z, in t) {  /* { dg-error "unknown type
name.*pid_t|unknown type name.*in" } */
+  return x + y + z + t;
+}
+
+int bar(int x, lon y, long z, ...){ /* { dg-error "unknown type name"
{ target *-*-* } 8 } */
+  return;
+}
+
+void foo(int n, int a[n], pid_t x); /* { dg-error "unknown type name"
{ target *-*-* } 12 } */
+void bar() {};

What is the default dg action for tests without dg-do directive? Also,
I am not sure how these directives are working correctly, because the
complete format is:

{ dg-error PATTERN COMMENT { TARGET_SPEC } LINE }

so the comment is missing. As far as I know, you can drop arguments
from right to left but not, for example, put a LINE and not a TARGET
or COMMENT. (Unfortunate, because LINE is probably the second most
used after PATTERN). BTW, you do not need LINE if the error is given
in the same line as the directive dg-error.

Cheers,

Manuel.
Joseph Myers June 23, 2010, 11:45 a.m. UTC | #2
On Wed, 23 Jun 2010, Shujing Zhao wrote:

> > > +      error ("unknown type name %qE", token->value);
> > 
> > I don't think %qE is appropriate here if the token is not an identifier.
> > 
> Yes. But the problem is that token->id_kind is C_ID_ID and token->value is
> IDENTIFIER_NODE. At function c_lex_one_token, the token->id_kind is always be
> set to C_ID_ID if it is not the other identifier. Look at enum c_id_kind,
> C_ID_ID is "an ordinary identifier" and there is an "not an identifier"
> C_ID_NONE, but it never be really set. If token is CPP_NAME, and it was not
> declared as some type name, the token->id_kind should be set C_ID_NONE. But
> where should C_ID_ID be set?
> I think that is the problem of c_lex_one_token, not the message format. I
> can't give a solution for that issue now. Does this patch can be committed
> firstly?

I don't understand what you are saying.  C_ID_NONE is for tokens that are 
not CPP_NAME at all.  C_ID_ID is for a subset of CPP_NAME tokens.

Tokens that are not CPP_NAME at all can have many different sorts of trees 
for token->value, which may not be appropriate for %qE.  They may use 
NULL_TREE if no token value is needed at all.  For example, the following 
test, which should be added to the next patch revision, segfaults with 
your latest patch applied because you are inappropriately using 
token->value when it is NULL.  As I said, use it *only* for tokens that 
are some kind of identifier (CPP_NAME or CPP_KEYWORD); include testcases 
for both CPP_NAME and CPP_KEYWORD and for other kinds of token.

void f(int a, *b);

> +/* PR c/44517: Improve diagnostic for misspelled typename in function declaration. */
> +int foo(int x, pid_t y, long z, in t) {  /* { dg-error "unknown type name 'pid_t'|unknown type name 'in'" } */
> +  return x + y + z + t;
> +}
> +
> +int bar(int x, lon y, long z, ...){ /* { dg-error "unknown type name 'lon'" } */
> +  return;
> +}
> +
> +void foo(int n, int a[n], pid_t x); /* { dg-error "unknown type name 'pid_t'" } */
> +void bar() {};

Use four different names for the four different functions.  Remove the 
stray semicolon after your last function body.
Manuel López-Ibáñez June 23, 2010, 12:01 p.m. UTC | #3
On 23 June 2010 13:45, Joseph S. Myers <joseph@codesourcery.com> wrote:
> token->value when it is NULL.  As I said, use it *only* for tokens that
> are some kind of identifier (CPP_NAME or CPP_KEYWORD); include testcases
> for both CPP_NAME and CPP_KEYWORD and for other kinds of token.
>
> void f(int a, *b);

What other types of tokens can be found at this point? How to print
them properly?

I guess for other types of tokens the "unknown type" message may not
be appropriate at all. Perhaps for those cases, the old message:

-      c_parser_error (parser,
-                     "expected declaration specifiers or %<...%>");

would still be appropriate.

>> +/* PR c/44517: Improve diagnostic for misspelled typename in function declaration. */
>> +int foo(int x, pid_t y, long z, in t) {  /* { dg-error "unknown type name 'pid_t'|unknown type name 'in'" } */
>> +  return x + y + z + t;
>> +}
>> +
>> +int bar(int x, lon y, long z, ...){ /* { dg-error "unknown type name 'lon'" } */
>> +  return;
>> +}
>> +
>> +void foo(int n, int a[n], pid_t x); /* { dg-error "unknown type name 'pid_t'" } */
>> +void bar() {};
>
> Use four different names for the four different functions.  Remove the
> stray semicolon after your last function body.

How can this test pass then? Perhaps it needs an explicit /* { dg-do
compile } */ ?

Manuel.
Joseph Myers June 23, 2010, 12:23 p.m. UTC | #4
On Wed, 23 Jun 2010, Manuel López-Ibáñez wrote:

> On 23 June 2010 13:45, Joseph S. Myers <joseph@codesourcery.com> wrote:
> > token->value when it is NULL.  As I said, use it *only* for tokens that
> > are some kind of identifier (CPP_NAME or CPP_KEYWORD); include testcases
> > for both CPP_NAME and CPP_KEYWORD and for other kinds of token.
> >
> > void f(int a, *b);
> 
> What other types of tokens can be found at this point?

That is a matter for the patch submitter to assess and explain as part of 
justifying the patch; understanding the circumstances in which particular 
code is reached is an important part of patching the parser.

> How to print them properly?

c_parse_error deals with printing different kinds of tokens, although not 
necessarily optimally.

> I guess for other types of tokens the "unknown type" message may not
> be appropriate at all. Perhaps for those cases, the old message:
> 
> -      c_parser_error (parser,
> -                     "expected declaration specifiers or %<...%>");
> 
> would still be appropriate.

That might also be the case for keywords; saying that e.g. "goto" is an 
unknown type may not be appropriate.

> >> +/* PR c/44517: Improve diagnostic for misspelled typename in function declaration. */
> >> +int foo(int x, pid_t y, long z, in t) {  /* { dg-error "unknown type name 'pid_t'|unknown type name 'in'" } */
> >> +  return x + y + z + t;
> >> +}
> >> +
> >> +int bar(int x, lon y, long z, ...){ /* { dg-error "unknown type name 'lon'" } */
> >> +  return;
> >> +}
> >> +
> >> +void foo(int n, int a[n], pid_t x); /* { dg-error "unknown type name 'pid_t'" } */
> >> +void bar() {};
> >
> > Use four different names for the four different functions.  Remove the
> > stray semicolon after your last function body.
> 
> How can this test pass then? Perhaps it needs an explicit /* { dg-do
> compile } */ ?

No, that's the default.  Probably the details of how syntactically 
erroneous function declarations are handled mean that conflicts do not end 
up being reported for conflicting types or multiple definitions (if one 
declaration cannot be parsed, it may not be possible to get a useful idea 
of whether the types conflict or not).  But since the point of this test 
is not the details of whether duplicate declarations are diagnosed when 
those declarations are erroneous in other ways, different names should be 
used so the test doesn't depend on those details.
diff mbox

Patch

Index: c-parser.c
===================================================================
--- c-parser.c	(revision 160889)
+++ c-parser.c	(working copy)
@@ -2706,7 +2706,7 @@  c_parser_parms_declarator (c_parser *par
 static struct c_arg_info *
 c_parser_parms_list_declarator (c_parser *parser, tree attrs)
 {
-  bool good_parm = false;
+  bool bad_parm = false;
   /* ??? Following the old parser, forward parameter declarations may
      use abstract declarators, and if no real parameter declarations
      follow the forward declarations then this is not diagnosed.  Also
@@ -2758,11 +2758,10 @@  c_parser_parms_list_declarator (c_parser
       /* Parse a parameter.  */
       struct c_parm *parm = c_parser_parameter_declaration (parser, attrs);
       attrs = NULL_TREE;
-      if (parm != NULL)
-	{
-	  good_parm = true;
-	  push_parm_decl (parm);
-	}
+      if (parm == NULL)
+	bad_parm = true;
+      else
+	push_parm_decl (parm);
       if (c_parser_next_token_is (parser, CPP_SEMICOLON))
 	{
 	  tree new_attrs;
@@ -2774,20 +2773,13 @@  c_parser_parms_list_declarator (c_parser
       if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
 	{
 	  c_parser_consume_token (parser);
-	  if (good_parm)
-	    return get_parm_info (false);
-	  else
+	  if (bad_parm)
 	    {
-	      struct c_arg_info *ret
-		= XOBNEW (&parser_obstack, struct c_arg_info);
-	      ret->parms = 0;
-	      ret->tags = 0;
-	      ret->types = 0;
-	      ret->others = 0;
-	      ret->pending_sizes = 0;
-	      ret->had_vla_unspec = 0;
-	      return ret;
+	      get_pending_sizes ();
+	      return NULL;
 	    }
+	  else
+	    return get_parm_info (false);
 	}
       if (!c_parser_require (parser, CPP_COMMA,
 			     "expected %<;%>, %<,%> or %<)%>"))
@@ -2802,20 +2794,13 @@  c_parser_parms_list_declarator (c_parser
 	  if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
 	    {
 	      c_parser_consume_token (parser);
-	      if (good_parm)
-		return get_parm_info (true);
-	      else
+	      if (bad_parm)
 		{
-		  struct c_arg_info *ret
-		    = XOBNEW (&parser_obstack, struct c_arg_info);
-		  ret->parms = 0;
-		  ret->tags = 0;
-		  ret->types = 0;
-		  ret->others = 0;
-		  ret->pending_sizes = 0;
-		  ret->had_vla_unspec = 0;
-		  return ret;
+		  get_pending_sizes ();
+		  return NULL;
 		}
+	      else
+		return get_parm_info (true);
 	    }
 	  else
 	    {
@@ -2843,8 +2828,12 @@  c_parser_parameter_declaration (c_parser
     {
       /* ??? In some Objective-C cases '...' isn't applicable so there
 	 should be a different message.  */
-      c_parser_error (parser,
-		      "expected declaration specifiers or %<...%>");
+      c_token *token = c_parser_peek_token (parser);
+      if (parser->error)
+	return NULL;
+      parser->error = true;
+      c_parser_set_source_position_from_token (token);
+      error ("unknown type name %qE", token->value);
       c_parser_skip_to_end_of_parameter (parser);
       return NULL;
     }
Index: testsuite/gcc.dg/noncompile/pr44517.c
===================================================================
--- testsuite/gcc.dg/noncompile/pr44517.c	(revision 0)
+++ testsuite/gcc.dg/noncompile/pr44517.c	(revision 0)
@@ -0,0 +1,11 @@ 
+/* PR c/44517: Improve diagnostic for misspelled typename in function declaration. */
+int foo(int x, pid_t y, long z, in t) {  /* { dg-error "unknown type name.*pid_t|unknown type name.*in" } */
+  return x + y + z + t;
+}
+
+int bar(int x, lon y, long z, ...){ /* { dg-error "unknown type name" { target *-*-* } 8 } */
+  return;
+}
+
+void foo(int n, int a[n], pid_t x); /* { dg-error "unknown type name" { target *-*-* } 12 } */
+void bar() {};
Index: testsuite/gcc.dg/noncompile/990416-1.c
===================================================================
--- testsuite/gcc.dg/noncompile/990416-1.c	(revision 160889)
+++ testsuite/gcc.dg/noncompile/990416-1.c	(working copy)
@@ -2,11 +2,11 @@  extern void *memcpy (void *, const void 
 typedef int word_type;
    
 static void
-copy_reg (unsigned int reg, frame_state *udata,	/* { dg-error "parse|syntax|expected" } */
-	  frame_state *target_udata)	/* { dg-error "expected" } */
+copy_reg (unsigned int reg, frame_state *udata,	/* { dg-error "unknown type name" } */
+	  frame_state *target_udata)	/* { dg-error "unknown type name" } */
 {  
-  word_type *preg = get_reg_addr (reg, udata, 0);	/* { dg-error "undeclared|function|without a cast" } */
-  word_type *ptreg = get_reg_addr (reg, target_udata, 0); /* { dg-error "undeclared|without a cast" } */
+  word_type *preg = ge_reg_addr (reg, udata, 0);
+  word_type *ptreg = ge_reg_addr (reg, target_udata, 0);
    
   memcpy (ptreg, preg, __builtin_dwarf_reg_size (reg));
 }