diff mbox series

[10/22] qapi/parser: Fix typing of token membership tests

Message ID 20210422030720.3685766-11-jsnow@redhat.com
State New
Headers show
Series qapi: static typing conversion, pt5a | expand

Commit Message

John Snow April 22, 2021, 3:07 a.m. UTC
When the token can be None, we can't use 'x in "abc"' style membership
tests to group types of tokens together, because 'None in "abc"' is a
TypeError.

Easy enough to fix, if not a little ugly.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/parser.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Markus Armbruster April 25, 2021, 7:59 a.m. UTC | #1
John Snow <jsnow@redhat.com> writes:

> When the token can be None, we can't use 'x in "abc"' style membership
> tests to group types of tokens together, because 'None in "abc"' is a
> TypeError.
>
> Easy enough to fix, if not a little ugly.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/parser.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index 7f3c009f64b..16fd36f8391 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -272,7 +272,7 @@ def get_values(self):
>          if self.tok == ']':
>              self.accept()
>              return expr
> -        if self.tok not in "{['tf":
> +        if self.tok is None or self.tok not in "{['tf":
>              raise QAPIParseError(
>                  self, "expected '{', '[', ']', string, or boolean")
>          while True:
> @@ -294,7 +294,8 @@ def get_expr(self, nested):
>          elif self.tok == '[':
>              self.accept()
>              expr = self.get_values()
> -        elif self.tok in "'tf":
> +        elif self.tok and self.tok in "'tf":
> +            assert isinstance(self.val, (str, bool))
>              expr = self.val
>              self.accept()
>          else:

How can self.tok be None?

I suspect this is an artifact of PATCH 04.  Before, self.tok is
initialized to the first token, then set to subsequent tokens (all str)
in turn.  After, it's initialized to None, then set to tokens in turn.
John Snow April 26, 2021, 5:51 p.m. UTC | #2
On 4/25/21 3:59 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> When the token can be None, we can't use 'x in "abc"' style membership
>> tests to group types of tokens together, because 'None in "abc"' is a
>> TypeError.
>>
>> Easy enough to fix, if not a little ugly.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/parser.py | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>> index 7f3c009f64b..16fd36f8391 100644
>> --- a/scripts/qapi/parser.py
>> +++ b/scripts/qapi/parser.py
>> @@ -272,7 +272,7 @@ def get_values(self):
>>           if self.tok == ']':
>>               self.accept()
>>               return expr
>> -        if self.tok not in "{['tf":
>> +        if self.tok is None or self.tok not in "{['tf":
>>               raise QAPIParseError(
>>                   self, "expected '{', '[', ']', string, or boolean")
>>           while True:
>> @@ -294,7 +294,8 @@ def get_expr(self, nested):
>>           elif self.tok == '[':
>>               self.accept()
>>               expr = self.get_values()
>> -        elif self.tok in "'tf":
>> +        elif self.tok and self.tok in "'tf":
>> +            assert isinstance(self.val, (str, bool))
>>               expr = self.val
>>               self.accept()
>>           else:
> 
> How can self.tok be None?
> 
> I suspect this is an artifact of PATCH 04.  Before, self.tok is
> initialized to the first token, then set to subsequent tokens (all str)
> in turn.  After, it's initialized to None, then set to tokens in turn.
> 

Actually, it's set to None to represent EOF. See here:

             elif self.tok == '\n':
	        if self.cursor == len(self.src):
                     self.tok = None
                     return

A more pythonic idiom would be to create a lexer class that behaves as 
an iterator, yielding Token class objects, and eventually, raising 
StopIteration.

(Not suggesting I do that now. I have thought about it though, yes.)

--js
Markus Armbruster April 27, 2021, 7 a.m. UTC | #3
John Snow <jsnow@redhat.com> writes:

> On 4/25/21 3:59 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> When the token can be None, we can't use 'x in "abc"' style membership
>>> tests to group types of tokens together, because 'None in "abc"' is a
>>> TypeError.
>>>
>>> Easy enough to fix, if not a little ugly.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   scripts/qapi/parser.py | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>>> index 7f3c009f64b..16fd36f8391 100644
>>> --- a/scripts/qapi/parser.py
>>> +++ b/scripts/qapi/parser.py
>>> @@ -272,7 +272,7 @@ def get_values(self):
>>>           if self.tok == ']':
>>>               self.accept()
>>>               return expr
>>> -        if self.tok not in "{['tf":
>>> +        if self.tok is None or self.tok not in "{['tf":
>>>               raise QAPIParseError(
>>>                   self, "expected '{', '[', ']', string, or boolean")
>>>           while True:
>>> @@ -294,7 +294,8 @@ def get_expr(self, nested):
>>>           elif self.tok == '[':
>>>               self.accept()
>>>               expr = self.get_values()
>>> -        elif self.tok in "'tf":
>>> +        elif self.tok and self.tok in "'tf":
>>> +            assert isinstance(self.val, (str, bool))
>>>               expr = self.val
>>>               self.accept()
>>>           else:
>> 
>> How can self.tok be None?
>> 
>> I suspect this is an artifact of PATCH 04.  Before, self.tok is
>> initialized to the first token, then set to subsequent tokens (all str)
>> in turn.  After, it's initialized to None, then set to tokens in turn.
>> 
>
> Actually, it's set to None to represent EOF. See here:
>
>              elif self.tok == '\n':
> 	        if self.cursor == len(self.src):
>                      self.tok = None
>                      return

Alright, then this is actually a bug fix:

    $ echo -n "{'key': " | python3 scripts/qapi-gen.py /dev/stdin
    Traceback (most recent call last):
      File "scripts/qapi-gen.py", line 19, in <module>
        sys.exit(main.main())
      File "/work/armbru/qemu/scripts/qapi/main.py", line 93, in main
        generate(args.schema,
      File "/work/armbru/qemu/scripts/qapi/main.py", line 50, in generate
        schema = QAPISchema(schema_file)
      File "/work/armbru/qemu/scripts/qapi/schema.py", line 852, in __init__
        parser = QAPISchemaParser(fname)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 59, in __init__
        self._parse()
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 81, in _parse
        expr = self.get_expr(False)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 293, in get_expr
        expr = self.get_members()
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 260, in get_members
        expr[key] = self.get_expr(True)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 297, in get_expr
        elif self.tok in "'tf":
    TypeError: 'in <string>' requires string as left operand, not NoneType

Likewise, the other hunk:

    $ echo -n "{'key': [" | python3 scripts/qapi-gen.py /dev/stdin
    Traceback (most recent call last):
      File "scripts/qapi-gen.py", line 19, in <module>
        sys.exit(main.main())
      File "/work/armbru/qemu/scripts/qapi/main.py", line 89, in main
        generate(args.schema,
      File "/work/armbru/qemu/scripts/qapi/main.py", line 51, in generate
        schema = QAPISchema(schema_file)
      File "/work/armbru/qemu/scripts/qapi/schema.py", line 860, in __init__
        parser = QAPISchemaParser(fname)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 71, in __init__
        expr = self.get_expr(False)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 270, in get_expr
        expr = self.get_members()
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 238, in get_members
        expr[key] = self.get_expr(True)
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 273, in get_expr
        expr = self.get_values()
      File "/work/armbru/qemu/scripts/qapi/parser.py", line 253, in get_values
        if self.tok not in "{['tf":
    TypeError: 'in <string>' requires string as left operand, not NoneType

Please add test cases.  I recommend adding them in a separate patch, so
this one's diff shows clearly what's being fixed.

There's a similar one in accept(), but it's safe:

            self.tok = self.src[self.cursor]
            ...
            elif self.tok in '{}:,[]':
                return

Regarding "if not a little ugly": instead of

    self.tok is None or self.tok not in "{['tf"

we could use

    self.tok not in tuple("{['tf")

> A more pythonic idiom would be to create a lexer class that behaves as 
> an iterator, yielding Token class objects, and eventually, raising 
> StopIteration.
>
> (Not suggesting I do that now. I have thought about it though, yes.)

Yes, let's resist the temptation to improve things into too many
directions at once.

Aside: using exceptions for perfectly unexceptional things like loop
termination is in questionable taste, but we gotta go with the Python
flow.
John Snow May 4, 2021, 1:01 a.m. UTC | #4
On 4/27/21 3:00 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> On 4/25/21 3:59 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> When the token can be None, we can't use 'x in "abc"' style membership
>>>> tests to group types of tokens together, because 'None in "abc"' is a
>>>> TypeError.
>>>>
>>>> Easy enough to fix, if not a little ugly.
>>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>> ---
>>>>    scripts/qapi/parser.py | 5 +++--
>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>>>> index 7f3c009f64b..16fd36f8391 100644
>>>> --- a/scripts/qapi/parser.py
>>>> +++ b/scripts/qapi/parser.py
>>>> @@ -272,7 +272,7 @@ def get_values(self):
>>>>            if self.tok == ']':
>>>>                self.accept()
>>>>                return expr
>>>> -        if self.tok not in "{['tf":
>>>> +        if self.tok is None or self.tok not in "{['tf":
>>>>                raise QAPIParseError(
>>>>                    self, "expected '{', '[', ']', string, or boolean")
>>>>            while True:
>>>> @@ -294,7 +294,8 @@ def get_expr(self, nested):
>>>>            elif self.tok == '[':
>>>>                self.accept()
>>>>                expr = self.get_values()
>>>> -        elif self.tok in "'tf":
>>>> +        elif self.tok and self.tok in "'tf":
>>>> +            assert isinstance(self.val, (str, bool))
>>>>                expr = self.val
>>>>                self.accept()
>>>>            else:
>>>
>>> How can self.tok be None?
>>>
>>> I suspect this is an artifact of PATCH 04.  Before, self.tok is
>>> initialized to the first token, then set to subsequent tokens (all str)
>>> in turn.  After, it's initialized to None, then set to tokens in turn.
>>>
>>
>> Actually, it's set to None to represent EOF. See here:
>>
>>               elif self.tok == '\n':
>> 	        if self.cursor == len(self.src):
>>                       self.tok = None
>>                       return
> 
> Alright, then this is actually a bug fix:
> 
>      $ echo -n "{'key': " | python3 scripts/qapi-gen.py /dev/stdin
>      Traceback (most recent call last):
>        File "scripts/qapi-gen.py", line 19, in <module>
>          sys.exit(main.main())
>        File "/work/armbru/qemu/scripts/qapi/main.py", line 93, in main
>          generate(args.schema,
>        File "/work/armbru/qemu/scripts/qapi/main.py", line 50, in generate
>          schema = QAPISchema(schema_file)
>        File "/work/armbru/qemu/scripts/qapi/schema.py", line 852, in __init__
>          parser = QAPISchemaParser(fname)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 59, in __init__
>          self._parse()
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 81, in _parse
>          expr = self.get_expr(False)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 293, in get_expr
>          expr = self.get_members()
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 260, in get_members
>          expr[key] = self.get_expr(True)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 297, in get_expr
>          elif self.tok in "'tf":
>      TypeError: 'in <string>' requires string as left operand, not NoneType
> 
> Likewise, the other hunk:
> 
>      $ echo -n "{'key': [" | python3 scripts/qapi-gen.py /dev/stdin
>      Traceback (most recent call last):
>        File "scripts/qapi-gen.py", line 19, in <module>
>          sys.exit(main.main())
>        File "/work/armbru/qemu/scripts/qapi/main.py", line 89, in main
>          generate(args.schema,
>        File "/work/armbru/qemu/scripts/qapi/main.py", line 51, in generate
>          schema = QAPISchema(schema_file)
>        File "/work/armbru/qemu/scripts/qapi/schema.py", line 860, in __init__
>          parser = QAPISchemaParser(fname)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 71, in __init__
>          expr = self.get_expr(False)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 270, in get_expr
>          expr = self.get_members()
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 238, in get_members
>          expr[key] = self.get_expr(True)
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 273, in get_expr
>          expr = self.get_values()
>        File "/work/armbru/qemu/scripts/qapi/parser.py", line 253, in get_values
>          if self.tok not in "{['tf":
>      TypeError: 'in <string>' requires string as left operand, not NoneType
> 
> Please add test cases.  I recommend adding them in a separate patch, so
> this one's diff shows clearly what's being fixed.
> 

Can't, again: because it's a crash, the test runner explodes.

Two choices, because I won't finish respinning this tonight:

(1) Amend the test runner to print generic exceptions using str(err), 
without the stack trace -- so we can check for crashes using the diffs 
-- again in its own commit.

(2) Just squish the tests and error messages into this commit like I did 
for the other crash fix I checked in.

I'd normally leap for #1, but you seem to have some affinity for 
allowing unpredictable things to explode very violently, so I am not sure.

--js
Markus Armbruster May 5, 2021, 6:29 a.m. UTC | #5
John Snow <jsnow@redhat.com> writes:

> On 4/27/21 3:00 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> On 4/25/21 3:59 AM, Markus Armbruster wrote:

[...]

>> Please add test cases.  I recommend adding them in a separate patch, so
>> this one's diff shows clearly what's being fixed.
>> 
>
> Can't, again: because it's a crash, the test runner explodes.
>
> Two choices, because I won't finish respinning this tonight:
>
> (1) Amend the test runner to print generic exceptions using str(err), 
> without the stack trace -- so we can check for crashes using the diffs 
> -- again in its own commit.
>
> (2) Just squish the tests and error messages into this commit like I did 
> for the other crash fix I checked in.
>
> I'd normally leap for #1, but you seem to have some affinity for 
> allowing unpredictable things to explode very violently, so I am not sure.

I love violent explosions.  Don't we all, as long as they're just bits?

(2) is fine.

If you'd like to provide for committing tests that currently explode:
the issue preventing it is insufficiently normalized output of
test-qapi.py.  test-qapi.py normalizes error messages (see except
QAPIError in test_and_diff()), but not tracebacks.

Omitting the tracebacks is an obvious and easy way to normalize.  But it
makes getting at the traceback harder: I need to know / remember how to
run the test by hand, without the normalization.  The cure seems worse
than the disease here.

To avoid the drawback, we'd need a simple and obvious way to run the
test so it shows the traceback.

Again, (2) is fine.
diff mbox series

Patch

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index 7f3c009f64b..16fd36f8391 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -272,7 +272,7 @@  def get_values(self):
         if self.tok == ']':
             self.accept()
             return expr
-        if self.tok not in "{['tf":
+        if self.tok is None or self.tok not in "{['tf":
             raise QAPIParseError(
                 self, "expected '{', '[', ']', string, or boolean")
         while True:
@@ -294,7 +294,8 @@  def get_expr(self, nested):
         elif self.tok == '[':
             self.accept()
             expr = self.get_values()
-        elif self.tok in "'tf":
+        elif self.tok and self.tok in "'tf":
+            assert isinstance(self.val, (str, bool))
             expr = self.val
             self.accept()
         else: