Patchwork [c] better recovery from missing semicolon after declaration

login
register
mail settings
Submitter Nathan Froyd
Date Nov. 17, 2010, 3:33 p.m.
Message ID <20101117153343.GY24469@nightcrawler>
Download mbox | patch
Permalink /patch/71576/
State New
Headers show

Comments

Nathan Froyd - Nov. 17, 2010, 3:33 p.m.
On Wed, Nov 17, 2010 at 10:29:29AM -0500, Nathan Froyd wrote:
> This patch attempts to fix that by detecting when we've just parsed a
> declaration and we're faced with what looks like the beginning of the
> next declaration.

I forgot that one testcase needed some adjustments.  The patch below
deletes some dg-error messages since for the opening line of the
testcase:

typedef BYTE unsigned char;	/* { dg-error "expected" } */

we now treat BYTE as a type.  I think this is an improvement, albeit an
unexpected one.

-Nathan
Paolo Bonzini - Nov. 17, 2010, 4:17 p.m.
On 11/17/2010 04:33 PM, Nathan Froyd wrote:
> On Wed, Nov 17, 2010 at 10:29:29AM -0500, Nathan Froyd wrote:
>> This patch attempts to fix that by detecting when we've just parsed a
>> declaration and we're faced with what looks like the beginning of the
>> next declaration.
>
> I forgot that one testcase needed some adjustments.  The patch below
> deletes some dg-error messages since for the opening line of the
> testcase:
>
> typedef BYTE unsigned char;	/* { dg-error "expected" } */
>
> we now treat BYTE as a type.  I think this is an improvement, albeit an
> unexpected one.

Ah, so we parse it as

    typedef /* implied int */ BYTE;
    unsigned char;

though without the extra warnings the above has.  This may be an 
improvement (better parsing recovery is good, but better semantic 
recovery is too!), but I'm not sure about the relatively convoluted way 
in which the parser gets there.

In particular, I'm worried that for example in something like this:

   typedef uintt16_t pid_t;

we would not be able to see the declaration of pid_t as a type.  In 
other words, I'm worried that it becomes harder to improve this tescase:

   typedef unsigned short uint16_t;

   /* should warn about unknown type name "uintt16_t", currently gives
      a parse error ("expected ... before pid_t") because unknown type
      names are not guessed in c_parser_declspecs.  */
   typedef uintt16_t pid_t;

   /* no error expected about unknown type name; currently fails */
   pid_t xyz;

Given my recent experience with adding unknown type name errors, it 
should not be hard to fix the above (just lack of time on my part), for 
example by moving this into c_parser_next_token_starts_typename:

   /* Try a bit harder to detect an unknown typename.  */
   if (token->type == CPP_NAME
       && token->id_kind == C_ID_ID
       && (c_parser_peek_2nd_token (parser)->type == CPP_NAME
           || c_parser_peek_2nd_token (parser)->type == CPP_MULT)
       && !lookup_name (token->value)

       /* Do not try too hard when we could have "object in array".  */
       && !parser->objc_could_be_foreach_context)
     return true;

Maybe you can give this a shot before applying these patches?

Paolo
Nathan Froyd - Nov. 18, 2010, 7:04 p.m.
On Wed, Nov 17, 2010 at 05:17:20PM +0100, Paolo Bonzini wrote:
> On 11/17/2010 04:33 PM, Nathan Froyd wrote:
> >I forgot that one testcase needed some adjustments.  The patch below
> >deletes some dg-error messages since for the opening line of the
> >testcase:
> >
> >typedef BYTE unsigned char;	/* { dg-error "expected" } */
> >
> >we now treat BYTE as a type.  I think this is an improvement, albeit an
> >unexpected one.
> 
> In other words, I'm worried that it becomes harder to improve this
> tescase:
> 
>   typedef unsigned short uint16_t;
> 
>   /* should warn about unknown type name "uintt16_t", currently gives
>      a parse error ("expected ... before pid_t") because unknown type
>      names are not guessed in c_parser_declspecs.  */
>   typedef uintt16_t pid_t;
> 
>   /* no error expected about unknown type name; currently fails */
>   pid_t xyz;
> 
> Given my recent experience with adding unknown type name errors, it
> should not be hard to fix the above (just lack of time on my part),
> for example by moving this into c_parser_next_token_starts_typename:
> 
>   /* Try a bit harder to detect an unknown typename.  */
>   if (token->type == CPP_NAME
>       && token->id_kind == C_ID_ID
>       && (c_parser_peek_2nd_token (parser)->type == CPP_NAME
>           || c_parser_peek_2nd_token (parser)->type == CPP_MULT)
>       && !lookup_name (token->value)
> 
>       /* Do not try too hard when we could have "object in array".  */
>       && !parser->objc_could_be_foreach_context)
>     return true;
> 
> Maybe you can give this a shot before applying these patches?

I tried the example above with and without my patch and with and without
modifying c_parser_next_token_starts_typename like so:

static inline bool
c_parser_next_token_starts_typename (c_parser *parser)
{
  c_token *token = c_parser_peek_token (parser);

  /* Try a bit harder to detect an unknown typename.  */
  if (token->type == CPP_NAME
      && token->id_kind == C_ID_ID
      && (c_parser_peek_2nd_token (parser)->type == CPP_NAME
          || c_parser_peek_2nd_token (parser)->type == CPP_MULT)
      && !lookup_name (token->value)
      /* Do not try too hard when we could have "object in array".  */
      && !parser->objc_could_be_foreach_context)
    return true;

  return c_token_starts_typename (token);
}

and with:

static inline bool
c_parser_next_token_starts_typename (c_parser *parser)
{
  c_token *token = c_parser_peek_token (parser);

  if (c_token_starts_typename (token))
    return true;

  /* Try a bit harder to detect an unknown typename.  */
  if (token->type == CPP_NAME
      && token->id_kind == C_ID_ID
      && (c_parser_peek_2nd_token (parser)->type == CPP_NAME
          || c_parser_peek_2nd_token (parser)->type == CPP_MULT)
      && !lookup_name (token->value)
      /* Do not try too hard when we could have "object in array".  */
      && !parser->objc_could_be_foreach_context)
    return true;

  return false;
}

since I wasn't sure which version you meant.  (I think the latter, for
agreement with c_parser_next_tokens_start_declaration.)  The error
message in all cases is:

/home/froydnj/src/paolo.c:7:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'pid_t'
/home/froydnj/src/paolo.c:11:3: error: unknown type name 'pid_t'

so my patch doesn't seem to have much effect on this testcase.  I think
the saving grace on your testcase is that in:

  typedef uintt16_t pid_t;

`pid_t' is not a keyword, so we don't stop parsing early, but I haven't
traced through the parser to verify that.

-Nathan
Paolo Bonzini - Nov. 18, 2010, 7:33 p.m.
On 11/18/2010 08:04 PM, Nathan Froyd wrote:
> I think
> the saving grace on your testcase is that in:
>
>    typedef uintt16_t pid_t;
>
> `pid_t' is not a keyword, so we don't stop parsing early, but I haven't
> traced through the parser to verify that.

Yes, that's possible.  In that case, I have no objection to the patch; 
the testcase is quite weird indeed in having a keyword after the id.

For what is worth, my WIP patch has no effect on your testcase:

$ ./cc1
typedef BYTE unsigned char;
BYTE x;
<stdin>:1:14: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ 
before ‘unsigned’
<stdin>:2:1: error: unknown type name ‘BYTE’

but fixes the other:

$ ./cc1
typedef uintt16_t pid_t;
pid_t x;
<stdin>:1:1: error: unknown type name ‘uintt16_t’

Thanks,

Paolo

Patch

diff --git a/gcc/testsuite/gcc.dg/noncompile/920923-1.c b/gcc/testsuite/gcc.dg/noncompile/920923-1.c
index f586a7c..a524931 100644
--- a/gcc/testsuite/gcc.dg/noncompile/920923-1.c
+++ b/gcc/testsuite/gcc.dg/noncompile/920923-1.c
@@ -6,9 +6,9 @@  struct PENT { caddr_t v_addr; };/* { dg-error "expected" } */
 typedef struct PENT prec;
 typedef struct PENT *prec_t;
 prec_t mem_hash;
-BYTE *mem_base;			/* { dg-error "unknown type name" } */
+BYTE *mem_base;
 struct PTE {
-     BYTE *p_page;		/* { dg-error "expected" } */
+     BYTE *p_page;
      perm_set p_perms;
 };
 typedef struct PTE pte;
@@ -27,7 +27,7 @@  void
 mmu_walk_find(va)
 caddr_t va;			/* { dg-error "unknown type name" } */
 {
-     BYTE *page_addr; /* { dg-error "unknown type name" } */
+     BYTE *page_addr;
      if (mmu_base[Level1(va)]->valid==0x0) { /* { dg-error "undeclared" } */
 	  l1_base = mmu_base[Level1(va)]->(u.p_tablep) = p_alloc(); /* { dg-error "expected|undeclared" } */
 	  mmu_base[Level1(va)]->valid = 0x3;
@@ -102,8 +102,8 @@  extern void *calloc(__SIZE_TYPE__, __SIZE_TYPE__);
 void
 init_mem()
 {
-     mem_base = (BYTE *) calloc(1024, (1<<13)); /* { dg-error "undeclared|expected" } */
-     ((void)((mem_base != (BYTE *)0)	/* { dg-error "expected" } */
+     mem_base = (BYTE *) calloc(1024, (1<<13));
+     ((void)((mem_base != (BYTE *)0)
 	     ? 0
 	     : (__eprintf("Failed assertion`%s'at line%d of`%s'.\n",
 			  "mem_base != (BYTE *)0", 366, "b.c"),