diff mbox

fixup! qapi: Forbid case-insensitive clashes

Message ID 1447887345-1170-1-git-send-email-eblake@redhat.com
State New
Headers show

Commit Message

Eric Blake Nov. 18, 2015, 10:55 p.m. UTC
[Replace the old commit message with this:]

qapi: Forbid case-insensitive clashes

In general, designing user interfaces that rely on case
distinction is poor practice.  Another benefit of enforcing
a restriction against case-insensitive clashes is that we
no longer have to worry about the situation of enum values
that could be distinguished by case if mapped by c_name(),
but which cannot be distinguished when mapped to C as
ALL_CAPS by c_enum_const() [via c_name(name, False).upper()].
Thus, having the generator look for case collisions up front
will prevent developers from worrying whether different
munging rules for member names compared to enum values as a
discriminator will cause any problems in qapi unions.

We do not have to care about c_name(name, True) vs.
c_name(name, False), as long as any pair of munged names being
compared were munged using the same flag value to c_name().
This is because commit 9fb081e already reserved names munging
to 'q_', and this patch extends the reservation to 'Q_'; and
because recent patches fixed things to ensure the only thing
done by the flag is adding the prefix 'q_', that the only use
of '.' (which also munges to '_') is in downstream extension
prefixes, and that enum munging does not add underscores to
any CamelCase values.

However, we DO have to care about the fact that we have a
command 'stop' and an event 'STOP' (currently the only case
collision of any of our .json entities).  To solve that, add
a new QAPISchemaEntity.c_namecase() that returns a munged
version of the name that can be used in checking for case
clashes, and override it in QAPISchemaEvent to use a different
case (thus, indexing by .c_namecase() will create two natural
key partitions), then add a new _clash_dict to
QAPISchema._def_entity() that checks for case collisions after
ruling out identical spellings.  This works because all
entities have at least a leading letter in their name.

We could go further and enforce that events are named with
'ALL_CAPS' and that nothing else is named in that manner; but
that can be done as a followup if desired.  We could also
separate commands from type names (the way we already separated
events into a separate namespace), but then we'd have to make
.c_namecase() pick something other than the convenient upper()
vs. lower() in order to partition dictionary keys into three
sets (perhaps a leading digit, since all entities start with a
letter).

There is also the possibility that we may want to add a future
extension to QMP of teaching it to be case-insensitive (the
user could request command 'Quit' instead of 'quit', or could
spell a struct field as 'CPU' instead of 'cpu').  But for that
to be a practical extension, we cannot break backwards
compatibility with any existing struct that was already
relying on case sensitivity.  Fortunately, after a recent
patch cleaned up CpuInfo, there are no such existing qapi
structs.  Of course, the idea of a future extension is not
as strong of a reason to make the change.

At any rate, it is easier to be strict now, and relax things
later if we find a reason to need case-sensitive QMP members,
than it would be to wish we had the restriction in place.

Add new tests args-case-clash.json and command-type-case-clash.json
to demonstrate that the collision detection works.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Use a separate clash dictionary, fix a typo that we couldn't
trigger, add .c_namecase(), no longer need to use attrgetter
---
 scripts/qapi.py | 48 ++++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 30 deletions(-)

Comments

Markus Armbruster Nov. 19, 2015, 1:32 p.m. UTC | #1
Eric Blake <eblake@redhat.com> writes:

> [Replace the old commit message with this:]
>
> qapi: Forbid case-insensitive clashes
>
> In general, designing user interfaces that rely on case
> distinction is poor practice.  Another benefit of enforcing
> a restriction against case-insensitive clashes is that we
> no longer have to worry about the situation of enum values
> that could be distinguished by case if mapped by c_name(),
> but which cannot be distinguished when mapped to C as
> ALL_CAPS by c_enum_const() [via c_name(name, False).upper()].
> Thus, having the generator look for case collisions up front
> will prevent developers from worrying whether different
> munging rules for member names compared to enum values as a
> discriminator will cause any problems in qapi unions.
>
> We do not have to care about c_name(name, True) vs.
> c_name(name, False), as long as any pair of munged names being
> compared were munged using the same flag value to c_name().
> This is because commit 9fb081e already reserved names munging
> to 'q_', and this patch extends the reservation to 'Q_'; and
> because recent patches fixed things to ensure the only thing
> done by the flag is adding the prefix 'q_', that the only use
> of '.' (which also munges to '_') is in downstream extension
> prefixes, and that enum munging does not add underscores to
> any CamelCase values.
>
> However, we DO have to care about the fact that we have a
> command 'stop' and an event 'STOP' (currently the only case
> collision of any of our .json entities).  To solve that, add
> a new QAPISchemaEntity.c_namecase() that returns a munged
> version of the name that can be used in checking for case
> clashes,

For [-._] clashes, too.  Easiest fix it to just say "checking for
clashes".

>          and override it in QAPISchemaEvent to use a different
> case (thus, indexing by .c_namecase() will create two natural
> key partitions), then add a new _clash_dict to
> QAPISchema._def_entity() that checks for case collisions after

To QAPISchema, actually.  In _def_entity(), we only add an entry.

> ruling out identical spellings.  This works because all
> entities have at least a leading letter in their name.
>
> We could go further and enforce that events are named with
> 'ALL_CAPS' and that nothing else is named in that manner; but
> that can be done as a followup if desired.  We could also
> separate commands from type names (the way we already separated
> events into a separate namespace), but then we'd have to make
> .c_namecase() pick something other than the convenient upper()
> vs. lower() in order to partition dictionary keys into three
> sets (perhaps a leading digit, since all entities start with a
> letter).

The upper / lower partition works for events / anything else, and is
kind of cute, although two separate clash dictionaries might be clearer.
I'd like to sketch that next, to help us decide.

Even if we decide to stick to the "partition the key space" idea now,
I'm reluctant to push it farther.  Prepending a digit would feel odd.
Title case would be slightly better.  But I think three separate clash
dictionaries would be simpler and clearer then.

Depending on what we do, the paragraph above may need some more
tweaking.

> There is also the possibility that we may want to add a future
> extension to QMP of teaching it to be case-insensitive (the
> user could request command 'Quit' instead of 'quit', or could
> spell a struct field as 'CPU' instead of 'cpu').  But for that
> to be a practical extension, we cannot break backwards
> compatibility with any existing struct that was already
> relying on case sensitivity.  Fortunately, after a recent
> patch cleaned up CpuInfo, there are no such existing qapi
> structs.  Of course, the idea of a future extension is not
> as strong of a reason to make the change.
>
> At any rate, it is easier to be strict now, and relax things
> later if we find a reason to need case-sensitive QMP members,
> than it would be to wish we had the restriction in place.
>
> Add new tests args-case-clash.json and command-type-case-clash.json
> to demonstrate that the collision detection works.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
> Use a separate clash dictionary, fix a typo that we couldn't
> trigger, add .c_namecase(), no longer need to use attrgetter
> ---
>  scripts/qapi.py | 48 ++++++++++++++++++------------------------------
>  1 file changed, 18 insertions(+), 30 deletions(-)

I find the squashed patch easier to understand, so here it is:

| diff --git a/scripts/qapi.py b/scripts/qapi.py
| index 72925c3..114e07a 100644
| --- a/scripts/qapi.py
| +++ b/scripts/qapi.py
| @@ -378,9 +378,9 @@ def check_name(expr_info, source, name, allow_optional=False,
|      # code always prefixes it with the enum name
|      if enum_member and membername[0].isdigit():
|          membername = 'D' + membername
| -    # Reserve the entire 'q_' namespace for c_name()
| +    # Reserve the entire 'q_'/'Q_' namespace for c_name()
|      if not valid_name.match(membername) or \
| -       c_name(membername, False).startswith('q_'):
| +       c_name(membername, False).lower().startswith('q_'):
|          raise QAPIExprError(expr_info,
|                              "%s uses invalid name '%s'" % (source, name))
|  
| @@ -800,6 +800,9 @@ class QAPISchemaEntity(object):
|      def c_name(self):
|          return c_name(self.name)
|  
| +    def c_namecase(self):
| +        return c_name(self.name).lower()
| +
|      def check(self, schema):
|          pass
|  

The entities are: commands, events, the various types.  This
implementation of c_namecase() covers commands and types.  It gets
overwritten for events.

Commands should not use upper case.  If we enforced that, the .lower()
would be superfluous there.  But it isn't wrong.

| @@ -1040,7 +1043,7 @@ class QAPISchemaObjectTypeMember(object):
|          assert self.type
|  
|      def check_clash(self, info, seen):
| -        cname = c_name(self.name)
| +        cname = c_name(self.name).lower()
|          if cname in seen:
|              raise QAPIExprError(info,
|                                  "%s collides with %s"

Not an entity, therefore QAPISchemaEntity.c_namecase() isn't inherited.

Members should not use upper case.  If we enforced that, the .lower()
would be superfluous.

| @@ -1085,7 +1088,7 @@ class QAPISchemaObjectTypeVariants(object):
|  
|      def check(self, schema, seen):
|          if not self.tag_member:    # flat union
| -            self.tag_member = seen[c_name(self.tag_name)]
| +            self.tag_member = seen[c_name(self.tag_name).lower()]
|              assert self.tag_name == self.tag_member.name
|          assert isinstance(self.tag_member.type, QAPISchemaEnumType)
|          for v in self.variants:

Key needs to match whatever we use in
QAPISchemaObjectTypeMember.check_clash().  It does.

| @@ -1184,12 +1187,16 @@ class QAPISchemaEvent(QAPISchemaEntity):
|      def visit(self, visitor):
|          visitor.visit_event(self.name, self.info, self.arg_type)
|  
| +    def c_namecase(self):
| +        return c_name(self.name).upper()
| +

Event names should not use lower case.  If we enforced that, the
.upper() would be superfluous.

If we enforced our naming conventions, namely types in CamelCase, events
in ALL_CAPS, everything else lower-case, the c_namecase() would become
slightly simpler: it's same as c_name() except for types, where it
additionally folds to lower case.  The name would be misleading then,
however.  Anyway, we can consider that once we enforce naming
conventions.

|  
|  class QAPISchema(object):
|      def __init__(self, fname):
|          try:
|              self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
|              self._entity_dict = {}
| +            self._clash_dict = {}
|              self._predefining = True
|              self._def_predefineds()
|              self._predefining = False
| @@ -1203,7 +1210,13 @@ class QAPISchema(object):
|          # Only the predefined types are allowed to not have info
|          assert ent.info or self._predefining
|          assert ent.name not in self._entity_dict
| +        cname = ent.c_namecase()
| +        if cname in self._clash_dict:
| +            raise QAPIExprError(ent.info,
| +                                "Entity '%s' collides with '%s'"
| +                                % (ent.name, self._clash_dict[cname]))
|          self._entity_dict[ent.name] = ent
| +        self._clash_dict[cname] = ent.name
|  
|      def lookup_entity(self, name, typ=None):
|          ent = self._entity_dict.get(name)
Eric Blake Nov. 19, 2015, 3:20 p.m. UTC | #2
On 11/19/2015 06:32 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
>> [Replace the old commit message with this:]
>>
>> qapi: Forbid case-insensitive clashes
>>

>> However, we DO have to care about the fact that we have a
>> command 'stop' and an event 'STOP' (currently the only case
>> collision of any of our .json entities).  To solve that, add
>> a new QAPISchemaEntity.c_namecase() that returns a munged
>> version of the name that can be used in checking for case
>> clashes,
> 
> For [-._] clashes, too.  Easiest fix it to just say "checking for
> clashes".
> 
>>          and override it in QAPISchemaEvent to use a different
>> case (thus, indexing by .c_namecase() will create two natural
>> key partitions), then add a new _clash_dict to
>> QAPISchema._def_entity() that checks for case collisions after
> 
> To QAPISchema, actually.  In _def_entity(), we only add an entry.

Easy fixes.

> 
>> ruling out identical spellings.  This works because all
>> entities have at least a leading letter in their name.
>>
>> We could go further and enforce that events are named with
>> 'ALL_CAPS' and that nothing else is named in that manner; but
>> that can be done as a followup if desired.  We could also
>> separate commands from type names (the way we already separated
>> events into a separate namespace), but then we'd have to make
>> .c_namecase() pick something other than the convenient upper()
>> vs. lower() in order to partition dictionary keys into three
>> sets (perhaps a leading digit, since all entities start with a
>> letter).
> 
> The upper / lower partition works for events / anything else, and is
> kind of cute, although two separate clash dictionaries might be clearer.
> I'd like to sketch that next, to help us decide.

Okay, sounds like I should post a v12.5 with the alternative of 2
separate clash dictionaries (3 dictionaries total), and as a complete
patch rather than a fixup.


> | @@ -800,6 +800,9 @@ class QAPISchemaEntity(object):
> |      def c_name(self):
> |          return c_name(self.name)
> |  
> | +    def c_namecase(self):
> | +        return c_name(self.name).lower()
> | +
> |      def check(self, schema):
> |          pass
> |  
> 
> The entities are: commands, events, the various types.  This
> implementation of c_namecase() covers commands and types.  It gets
> overwritten for events.
> 
> Commands should not use upper case.  If we enforced that, the .lower()
> would be superfluous there.  But it isn't wrong.

I won't need c_namecase() with separate dictionaries.  Once we have the
two approaches to compare, we can decide which one looks nicer.

> 
> | @@ -1040,7 +1043,7 @@ class QAPISchemaObjectTypeMember(object):
> |          assert self.type
> |  
> |      def check_clash(self, info, seen):
> | -        cname = c_name(self.name)
> | +        cname = c_name(self.name).lower()
> |          if cname in seen:
> |              raise QAPIExprError(info,
> |                                  "%s collides with %s"
> 
> Not an entity, therefore QAPISchemaEntity.c_namecase() isn't inherited.
> 
> Members should not use upper case.  If we enforced that, the .lower()
> would be superfluous.

Even if we enforce it, we have to whitelist exceptions for backward
compatibility; those exceptions would have to use .lower(), or we can
just skip collision detection on those whitelisted types (where the
existence of the whitelist reminds us to maintain no case collisions by
hand).


> | @@ -1184,12 +1187,16 @@ class QAPISchemaEvent(QAPISchemaEntity):
> |      def visit(self, visitor):
> |          visitor.visit_event(self.name, self.info, self.arg_type)
> |  
> | +    def c_namecase(self):
> | +        return c_name(self.name).upper()
> | +
> 
> Event names should not use lower case.  If we enforced that, the
> .upper() would be superfluous.

Looks like we could enforce this one without a whitelist.

> 
> If we enforced our naming conventions, namely types in CamelCase, events
> in ALL_CAPS, everything else lower-case, the c_namecase() would become
> slightly simpler: it's same as c_name() except for types, where it
> additionally folds to lower case.  The name would be misleading then,
> however.  Anyway, we can consider that once we enforce naming
> conventions.

And enforcing conventions certainly won't happen any sooner than the
rest of the pending queue is flushed.
diff mbox

Patch

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 5207c12..114e07a 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -13,7 +13,6 @@ 

 import re
 from ordereddict import OrderedDict
-from operator import attrgetter
 import errno
 import getopt
 import os
@@ -381,7 +380,7 @@  def check_name(expr_info, source, name, allow_optional=False,
         membername = 'D' + membername
     # Reserve the entire 'q_'/'Q_' namespace for c_name()
     if not valid_name.match(membername) or \
-       c_name(membername, False).upper().startswith('Q_'):
+       c_name(membername, False).lower().startswith('q_'):
         raise QAPIExprError(expr_info,
                             "%s uses invalid name '%s'" % (source, name))

@@ -801,6 +800,9 @@  class QAPISchemaEntity(object):
     def c_name(self):
         return c_name(self.name)

+    def c_namecase(self):
+        return c_name(self.name).lower()
+
     def check(self, schema):
         pass

@@ -1041,7 +1043,7 @@  class QAPISchemaObjectTypeMember(object):
         assert self.type

     def check_clash(self, info, seen):
-        cname = c_name(self.name).upper()
+        cname = c_name(self.name).lower()
         if cname in seen:
             raise QAPIExprError(info,
                                 "%s collides with %s"
@@ -1086,7 +1088,7 @@  class QAPISchemaObjectTypeVariants(object):

     def check(self, schema, seen):
         if not self.tag_member:    # flat union
-            self.tag_member = seen[c_name(self.tag_name).upper()]
+            self.tag_member = seen[c_name(self.tag_name).lower()]
             assert self.tag_name == self.tag_member.name
         assert isinstance(self.tag_member.type, QAPISchemaEnumType)
         for v in self.variants:
@@ -1185,14 +1187,16 @@  class QAPISchemaEvent(QAPISchemaEntity):
     def visit(self, visitor):
         visitor.visit_event(self.name, self.info, self.arg_type)

+    def c_namecase(self):
+        return c_name(self.name).upper()
+

 class QAPISchema(object):
     def __init__(self, fname):
         try:
             self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
-            # _entity_dict holds two namespaces: events are stored via
-            # name.upper(), commands and types are stored via name.lower().
             self._entity_dict = {}
+            self._clash_dict = {}
             self._predefining = True
             self._def_predefineds()
             self._predefining = False
@@ -1205,32 +1209,17 @@  class QAPISchema(object):
     def _def_entity(self, ent):
         # Only the predefined types are allowed to not have info
         assert ent.info or self._predefining
-        # On insertion, we need to check for an exact match in either
-        # namespace, then for case collision in a single namespace
-        if self.lookup_entity(ent.name):
-            raise QAPIExprError(ent.info,
-                                "Entity '%s' already defined" % end.name)
-        cname = c_name(ent.name)
-        if isinstance(ent, QAPISchemaEvent):
-            cname = cname.upper()
-        else:
-            cname = cname.lower()
-        if cname in self._entity_dict:
+        assert ent.name not in self._entity_dict
+        cname = ent.c_namecase()
+        if cname in self._clash_dict:
             raise QAPIExprError(ent.info,
                                 "Entity '%s' collides with '%s'"
-                                % (ent.name, self._entity_dict[cname].name))
-        self._entity_dict[cname] = ent
+                                % (ent.name, self._clash_dict[cname]))
+        self._entity_dict[ent.name] = ent
+        self._clash_dict[cname] = ent.name

     def lookup_entity(self, name, typ=None):
-        # No thanks to 'stop'/'STOP', we have to check two namespaces during
-        # lookups, but only return a result on exact match.
-        ent1 = self._entity_dict.get(c_name(name).lower())   # commands, types
-        ent2 = self._entity_dict.get(c_name(name).upper())   # events
-        ent = None
-        if ent1 and ent1.name == name:
-            ent = ent1
-        elif ent2 and ent2.name == name:
-            ent = ent2
+        ent = self._entity_dict.get(name)
         if typ and not isinstance(ent, typ):
             return None
         return ent
@@ -1414,8 +1403,7 @@  class QAPISchema(object):

     def visit(self, visitor):
         visitor.visit_begin(self)
-        for entity in sorted(self._entity_dict.values(),
-                             key=attrgetter('name')):
+        for (name, entity) in sorted(self._entity_dict.items()):
             if visitor.visit_needed(entity):
                 entity.visit(visitor)
         visitor.visit_end()