diff mbox

[1/5,GIMPLE,FE] PR testsuite/80580. Handle missing labels in goto statements

Message ID 377838f2-a96b-3147-20ba-d14b1a4bea40@gmail.com
State New
Headers show

Commit Message

Mikhail Maltsev May 29, 2017, 4:38 a.m. UTC
Hi. Sorry for a long delay.

On 02.05.2017 17:16, Richard Biener wrote:
> Certainly an improvement.  I suppose we can do better error recovery
> for cases like
> 
>  if (1)
>    goto
>  else
>    goto bar;
> 
> but I guess this is better than nothing.
I think it's worth spending a bit more time to get this right.

> 
> And yes, we use c_parser_error -- input_location should be ok but here
> we just peek which may upset the parser.  Maybe it works better
> when consuming the token before issueing the error?  Thus
> 
> Index: gcc/c/gimple-parser.c
> ===================================================================
> --- gcc/c/gimple-parser.c       (revision 247498)
> +++ gcc/c/gimple-parser.c       (working copy)
> @@ -1315,8 +1315,8 @@ c_parser_gimple_if_stmt (c_parser *parse
>        loc = c_parser_peek_token (parser)->location;
>        c_parser_consume_token (parser);
>        label = c_parser_peek_token (parser)->value;
> -      t_label = lookup_label_for_goto (loc, label);
>        c_parser_consume_token (parser);
> +      t_label = lookup_label_for_goto (loc, label);
>        if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
>         return;
>      }
> 
I was more focused on cases with missing labels (i.e. 'label' is NULL), rather
than cases with syntactically correct if-statements referencing undefined
labels. And, frankly speaking, I'm not sure that swapping
'c_parser_consume_token' with 'lookup_label_for_goto' will help, because
'lookup_label_for_goto' already gets a 'loc' parameter.

BTW, unfortunately GIMPLE FE does not handle undefined labels properly. I.e.,
this test case

__GIMPLE() void foo()
{
bb_0:
  if (0)
    goto bb_0;
  else
    goto bb_1;
}

causes an ICE somewhere in build_gimple_cfg/cleanup_dead_labels. But this is a
separate issue, of course.

I attached a slightly modified patch, which incorporates your changes and also uses

  if (! c_parser_next_token_is (parser, CPP_NAME))
    ...

instead of

  label = c_parser_peek_token (parser)->value;
  ...
  if (!label)
    ...

for better readability. This version correctly handles missing labels and
semicolons in both branches of the 'if' statement.

The only major problem, which I want to fix is error recovery in the following
example:

__GIMPLE() void foo()
{
  if (1)
    goto lbl;
  else
    goto ; /* { dg-error "expected label" } */
}

__GIMPLE() void bar()
{
  if (1)
    goto lbl;
  else
    goto
} /* { dg-error "expected label" } */

In this case GCC correctly diagnoses both errors. But if I swap these two
functions so that 'bar' comes before 'foo', the error in 'foo' is not diagnosed.
I did not dive into details, but my speculation is that the parser  enters some
strange state and skips 'foo' entirely (-fdump-tree-gimple contains only 'bar').
If I add another function after 'foo', it is handled correctly.
Any ideas, why this could happen and how to improve error recovery in this case?

Comments

Richard Biener May 30, 2017, 11:46 a.m. UTC | #1
On Mon, May 29, 2017 at 6:38 AM, Mikhail Maltsev <maltsevm@gmail.com> wrote:
> Hi. Sorry for a long delay.
>
> On 02.05.2017 17:16, Richard Biener wrote:
>> Certainly an improvement.  I suppose we can do better error recovery
>> for cases like
>>
>>  if (1)
>>    goto
>>  else
>>    goto bar;
>>
>> but I guess this is better than nothing.
> I think it's worth spending a bit more time to get this right.
>
>>
>> And yes, we use c_parser_error -- input_location should be ok but here
>> we just peek which may upset the parser.  Maybe it works better
>> when consuming the token before issueing the error?  Thus
>>
>> Index: gcc/c/gimple-parser.c
>> ===================================================================
>> --- gcc/c/gimple-parser.c       (revision 247498)
>> +++ gcc/c/gimple-parser.c       (working copy)
>> @@ -1315,8 +1315,8 @@ c_parser_gimple_if_stmt (c_parser *parse
>>        loc = c_parser_peek_token (parser)->location;
>>        c_parser_consume_token (parser);
>>        label = c_parser_peek_token (parser)->value;
>> -      t_label = lookup_label_for_goto (loc, label);
>>        c_parser_consume_token (parser);
>> +      t_label = lookup_label_for_goto (loc, label);
>>        if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
>>         return;
>>      }
>>
> I was more focused on cases with missing labels (i.e. 'label' is NULL), rather
> than cases with syntactically correct if-statements referencing undefined
> labels. And, frankly speaking, I'm not sure that swapping
> 'c_parser_consume_token' with 'lookup_label_for_goto' will help, because
> 'lookup_label_for_goto' already gets a 'loc' parameter.

Ah, indeed.

> BTW, unfortunately GIMPLE FE does not handle undefined labels properly. I.e.,
> this test case
>
> __GIMPLE() void foo()
> {
> bb_0:
>   if (0)
>     goto bb_0;
>   else
>     goto bb_1;
> }
>
> causes an ICE somewhere in build_gimple_cfg/cleanup_dead_labels. But this is a
> separate issue, of course.

Yes.  I think ICEing for invalid GIMPLE (as opposed for syntactic
errors) is OK for now.

> I attached a slightly modified patch, which incorporates your changes and also uses
>
>   if (! c_parser_next_token_is (parser, CPP_NAME))
>     ...
>
> instead of
>
>   label = c_parser_peek_token (parser)->value;
>   ...
>   if (!label)
>     ...
>
> for better readability. This version correctly handles missing labels and
> semicolons in both branches of the 'if' statement.
>
> The only major problem, which I want to fix is error recovery in the following
> example:
>
> __GIMPLE() void foo()
> {
>   if (1)
>     goto lbl;
>   else
>     goto ; /* { dg-error "expected label" } */
> }
>
> __GIMPLE() void bar()
> {
>   if (1)
>     goto lbl;
>   else
>     goto
> } /* { dg-error "expected label" } */
>
> In this case GCC correctly diagnoses both errors. But if I swap these two
> functions so that 'bar' comes before 'foo', the error in 'foo' is not diagnosed.
> I did not dive into details, but my speculation is that the parser  enters some
> strange state and skips 'foo' entirely (-fdump-tree-gimple contains only 'bar').
> If I add another function after 'foo', it is handled correctly.
> Any ideas, why this could happen and how to improve error recovery in this case?

Huh.  I suppose this is due to us testing c_parser_error () to skip
tokens in some places and
not clearing it after (successfully) ending parsing of a function.

Not sure where clearing of parser->error happens usually, it somewhat
looks like it has
to be done manually somewhere up in the callstack (I suppose once we managed to
"recover").  Most c_parser_skip* routines clear error for example.

Richard.

>
> --
> Regards,
>    Mikhail Maltsev
Richard Biener May 30, 2017, 11:47 a.m. UTC | #2
On Tue, May 30, 2017 at 1:46 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, May 29, 2017 at 6:38 AM, Mikhail Maltsev <maltsevm@gmail.com> wrote:
>> Hi. Sorry for a long delay.
>>
>> On 02.05.2017 17:16, Richard Biener wrote:
>>> Certainly an improvement.  I suppose we can do better error recovery
>>> for cases like
>>>
>>>  if (1)
>>>    goto
>>>  else
>>>    goto bar;
>>>
>>> but I guess this is better than nothing.
>> I think it's worth spending a bit more time to get this right.
>>
>>>
>>> And yes, we use c_parser_error -- input_location should be ok but here
>>> we just peek which may upset the parser.  Maybe it works better
>>> when consuming the token before issueing the error?  Thus
>>>
>>> Index: gcc/c/gimple-parser.c
>>> ===================================================================
>>> --- gcc/c/gimple-parser.c       (revision 247498)
>>> +++ gcc/c/gimple-parser.c       (working copy)
>>> @@ -1315,8 +1315,8 @@ c_parser_gimple_if_stmt (c_parser *parse
>>>        loc = c_parser_peek_token (parser)->location;
>>>        c_parser_consume_token (parser);
>>>        label = c_parser_peek_token (parser)->value;
>>> -      t_label = lookup_label_for_goto (loc, label);
>>>        c_parser_consume_token (parser);
>>> +      t_label = lookup_label_for_goto (loc, label);
>>>        if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
>>>         return;
>>>      }
>>>
>> I was more focused on cases with missing labels (i.e. 'label' is NULL), rather
>> than cases with syntactically correct if-statements referencing undefined
>> labels. And, frankly speaking, I'm not sure that swapping
>> 'c_parser_consume_token' with 'lookup_label_for_goto' will help, because
>> 'lookup_label_for_goto' already gets a 'loc' parameter.
>
> Ah, indeed.
>
>> BTW, unfortunately GIMPLE FE does not handle undefined labels properly. I.e.,
>> this test case
>>
>> __GIMPLE() void foo()
>> {
>> bb_0:
>>   if (0)
>>     goto bb_0;
>>   else
>>     goto bb_1;
>> }
>>
>> causes an ICE somewhere in build_gimple_cfg/cleanup_dead_labels. But this is a
>> separate issue, of course.
>
> Yes.  I think ICEing for invalid GIMPLE (as opposed for syntactic
> errors) is OK for now.
>
>> I attached a slightly modified patch, which incorporates your changes and also uses
>>
>>   if (! c_parser_next_token_is (parser, CPP_NAME))
>>     ...
>>
>> instead of
>>
>>   label = c_parser_peek_token (parser)->value;
>>   ...
>>   if (!label)
>>     ...
>>
>> for better readability. This version correctly handles missing labels and
>> semicolons in both branches of the 'if' statement.
>>
>> The only major problem, which I want to fix is error recovery in the following
>> example:
>>
>> __GIMPLE() void foo()
>> {
>>   if (1)
>>     goto lbl;
>>   else
>>     goto ; /* { dg-error "expected label" } */
>> }
>>
>> __GIMPLE() void bar()
>> {
>>   if (1)
>>     goto lbl;
>>   else
>>     goto
>> } /* { dg-error "expected label" } */
>>
>> In this case GCC correctly diagnoses both errors. But if I swap these two
>> functions so that 'bar' comes before 'foo', the error in 'foo' is not diagnosed.
>> I did not dive into details, but my speculation is that the parser  enters some
>> strange state and skips 'foo' entirely (-fdump-tree-gimple contains only 'bar').
>> If I add another function after 'foo', it is handled correctly.
>> Any ideas, why this could happen and how to improve error recovery in this case?
>
> Huh.  I suppose this is due to us testing c_parser_error () to skip
> tokens in some places and
> not clearing it after (successfully) ending parsing of a function.
>
> Not sure where clearing of parser->error happens usually, it somewhat
> looks like it has
> to be done manually somewhere up in the callstack (I suppose once we managed to
> "recover").  Most c_parser_skip* routines clear error for example.

Oh, and the patch you posted is ok.

Thanks,
Richard.

> Richard.
>
>>
>> --
>> Regards,
>>    Mikhail Maltsev
diff mbox

Patch

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index ed9e7c5..44ca738 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -1336,9 +1336,14 @@  c_parser_gimple_if_stmt (c_parser *parser, gimple_seq *seq)
     {
       loc = c_parser_peek_token (parser)->location;
       c_parser_consume_token (parser);
+      if (! c_parser_next_token_is (parser, CPP_NAME))
+	{
+	  c_parser_error (parser, "expected label");
+	  return;
+	}
       label = c_parser_peek_token (parser)->value;
-      t_label = lookup_label_for_goto (loc, label);
       c_parser_consume_token (parser);
+      t_label = lookup_label_for_goto (loc, label);
       if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
 	return;
     }
@@ -1360,9 +1365,14 @@  c_parser_gimple_if_stmt (c_parser *parser, gimple_seq *seq)
     {
       loc = c_parser_peek_token (parser)->location;
       c_parser_consume_token (parser);
+      if (! c_parser_next_token_is (parser, CPP_NAME))
+	{
+	  c_parser_error (parser, "expected label");
+	  return;
+	}
       label = c_parser_peek_token (parser)->value;
-      f_label = lookup_label_for_goto (loc, label);
       c_parser_consume_token (parser);
+      f_label = lookup_label_for_goto (loc, label);
       if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
 	return;
     }
diff --git a/gcc/testsuite/gcc.dg/gimplefe-error-7.c b/gcc/testsuite/gcc.dg/gimplefe-error-7.c
new file mode 100644
index 0000000..7d5ff37
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gimplefe-error-7.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fgimple" } */
+
+__GIMPLE() void fn1()
+{
+  if (1)
+    goto
+  else /* { dg-error "expected label" } */
+    goto lbl;
+}
+
+__GIMPLE() void fn2()
+{
+  if (1)
+    goto lbl;
+  else
+    goto ; /* { dg-error "expected label" } */
+}
+
+__GIMPLE() void fn3()
+{
+  if (1)
+    goto lbl;
+  else
+    goto
+} /* { dg-error "expected label" } */
+