Message ID | 1447836791-369-22-git-send-email-eblake@redhat.com |
---|---|
State | New |
Headers | show |
Eric Blake <eblake@redhat.com> writes: > We already documented that qapi names should match specific > patterns (such as starting with a letter unless it was an enum > value or a downstream extension). Tighten that from a suggestion > into a hard requirement, which frees up names beginning with a > single underscore for qapi internal usage. If we care enough about naming conventions to document them, enforcing them makes only sense. > Also restrict things > to avoid 'a__b' or 'a--b' (that is, dash and underscore must not > occur consecutively). I feel this is entering "foolish names" territory, where things like "IcAnFiNdNaMeSeVeNlEsSrEaDaBlEtHaNStudlyCaps" live. Catching fools automatically is generally futile, they're too creative :) Let's drop this part. > Add a new test, reserved-member-underscore, to demonstrate the > tighter checking. > > Signed-off-by: Eric Blake <eblake@redhat.com> > > --- > v12: new patch > --- > docs/qapi-code-gen.txt | 19 ++++++++++--------- > scripts/qapi.py | 12 +++++++----- > tests/Makefile | 1 + > tests/qapi-schema/reserved-member-underscore.err | 1 + > tests/qapi-schema/reserved-member-underscore.exit | 1 + > tests/qapi-schema/reserved-member-underscore.json | 4 ++++ > tests/qapi-schema/reserved-member-underscore.out | 0 > 7 files changed, 24 insertions(+), 14 deletions(-) > create mode 100644 tests/qapi-schema/reserved-member-underscore.err > create mode 100644 tests/qapi-schema/reserved-member-underscore.exit > create mode 100644 tests/qapi-schema/reserved-member-underscore.json > create mode 100644 tests/qapi-schema/reserved-member-underscore.out > > diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt > index ceb9a78..ec225d1 100644 > --- a/docs/qapi-code-gen.txt > +++ b/docs/qapi-code-gen.txt > @@ -118,17 +118,18 @@ tracking optional fields. > > Any name (command, event, type, field, or enum value) beginning with > "x-" is marked experimental, and may be withdrawn or changed > -incompatibly in a future release. Downstream vendors may add > -extensions; such extensions should begin with a prefix matching > +incompatibly in a future release. All names must begin with a letter, > +and contain only ASCII letters, digits, dash, and underscore, where > +dash and underscore cannot occur consecutively. There are two > +exceptions: enum values may start with a digit, and any extensions > +added by downstream vendors should start with a prefix matching > "__RFQDN_" (for the reverse-fully-qualified-domain-name of the > vendor), even if the rest of the name uses dash (example: > -__com.redhat_drive-mirror). Other than downstream extensions (with > -leading underscore and the use of dots), all names should begin with a > -letter, and contain only ASCII letters, digits, dash, and underscore. > -Names beginning with 'q_' are reserved for the generator: QMP names > -that resemble C keywords or other problematic strings will be munged > -in C to use this prefix. For example, a field named "default" in > -qapi becomes "q_default" in the generated C code. > +__com.redhat_drive-mirror). Names beginning with 'q_' are reserved > +for the generator: QMP names that resemble C keywords or other > +problematic strings will be munged in C to use this prefix. For > +example, a field named "default" in qapi becomes "q_default" in the > +generated C code. > > In the rest of this document, usage lines are given for each > expression type, with literal strings written in lower case and > diff --git a/scripts/qapi.py b/scripts/qapi.py > index 4a30bc0..e6d014b 100644 > --- a/scripts/qapi.py > +++ b/scripts/qapi.py > @@ -353,9 +353,11 @@ def discriminator_find_enum_define(expr): > return find_enum(discriminator_type) > > > -# FIXME should enforce "other than downstream extensions [...], all > -# names should begin with a letter". > -valid_name = re.compile('^[a-zA-Z_][a-zA-Z0-9_.-]*$') > +# Names must be letters, numbers, -, and _. They must start with letter, > +# except for downstream extensions which must start with __RFQDN_. > +# Dots are only valid in the downstream extension prefix. > +valid_name = re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' > + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') This regexp consists of two parts: optional __RFQDN_ prefix and the name proper. The latter stays simpler if we don't attempt to catch adjacent [-_]. The former isn't quite right. According to RFC 822 Appendix 1 - Domain Name Syntax Specification: * We must accept '-' in addition to digits. * We require RFQDN to start with a letter. This is still a loose match. The tight match is "labels separated by dots", where labels consist of letters, digits and '-', starting with a letter. I wouldn't bother checking first characters are letters at all. Recommend valid_name = re.compile('^(__[a-zA-Z0-9.-]+_)?' '[a-zA-Z][a-zA-Z0-9_-]*$') > > > def check_name(expr_info, source, name, allow_optional=False, > @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, allow_optional=False, > % (source, name)) > # Enum members can start with a digit, because the generated C > # code always prefixes it with the enum name > - if enum_member: > - membername = '_' + membername > + if enum_member and membername[0].isdigit(): What's wrong with the old condition? > + membername = 'D' + membername > # Reserve the entire 'q_' namespace for c_name() > if not valid_name.match(membername) or \ > c_name(membername, False).startswith('q_'): > diff --git a/tests/Makefile b/tests/Makefile > index 1c2c8ee..b70d246 100644 > --- a/tests/Makefile > +++ b/tests/Makefile > @@ -320,6 +320,7 @@ qapi-schema += reserved-command-q.json > qapi-schema += reserved-member-has.json > qapi-schema += reserved-member-q.json > qapi-schema += reserved-member-u.json > +qapi-schema += reserved-member-underscore.json > qapi-schema += reserved-type-kind.json > qapi-schema += reserved-type-list.json > qapi-schema += returns-alternate.json > diff --git a/tests/qapi-schema/reserved-member-underscore.err b/tests/qapi-schema/reserved-member-underscore.err > new file mode 100644 > index 0000000..65ff0da > --- /dev/null > +++ b/tests/qapi-schema/reserved-member-underscore.err > @@ -0,0 +1 @@ > +tests/qapi-schema/reserved-member-underscore.json:4: Member of 'data' for struct 'Oops' uses invalid name '_oops' > diff --git a/tests/qapi-schema/reserved-member-underscore.exit b/tests/qapi-schema/reserved-member-underscore.exit > new file mode 100644 > index 0000000..d00491f > --- /dev/null > +++ b/tests/qapi-schema/reserved-member-underscore.exit > @@ -0,0 +1 @@ > +1 > diff --git a/tests/qapi-schema/reserved-member-underscore.json b/tests/qapi-schema/reserved-member-underscore.json > new file mode 100644 > index 0000000..4a3a017 > --- /dev/null > +++ b/tests/qapi-schema/reserved-member-underscore.json > @@ -0,0 +1,4 @@ > +# C member name collision > +# We reject use of a single leading underscore in all names (names must > +# begin with a letter or a downstream extension double-underscore prefix). > +{ 'struct': 'Oops', 'data': { '_oops': 'str' } } > diff --git a/tests/qapi-schema/reserved-member-underscore.out b/tests/qapi-schema/reserved-member-underscore.out > new file mode 100644 > index 0000000..e69de29
On 11/18/2015 04:51 AM, Markus Armbruster wrote: > Eric Blake <eblake@redhat.com> writes: > >> We already documented that qapi names should match specific >> patterns (such as starting with a letter unless it was an enum >> value or a downstream extension). Tighten that from a suggestion >> into a hard requirement, which frees up names beginning with a >> single underscore for qapi internal usage. > > If we care enough about naming conventions to document them, enforcing > them makes only sense. > >> Also restrict things >> to avoid 'a__b' or 'a--b' (that is, dash and underscore must not >> occur consecutively). > > I feel this is entering "foolish names" territory, where things like > "IcAnFiNdNaMeSeVeNlEsSrEaDaBlEtHaNStudlyCaps" live. Catching fools IhAdToOmUcHfUnReAdInGtHaT [It's amazing that scholars can ever manage to correctly interpret old written languages, thanks to omitted word breaks (scriptio continua) or omitted vowels (such as in Hebrew)] > automatically is generally futile, they're too creative :) > > Let's drop this part. Sure, I can do that. I'll post a fixup patch, as it will affect docs too. > >> Add a new test, reserved-member-underscore, to demonstrate the >> tighter checking. >> >> Signed-off-by: Eric Blake <eblake@redhat.com> >> >> --- >> v12: new patch >> --- >> +incompatibly in a future release. All names must begin with a letter, >> +and contain only ASCII letters, digits, dash, and underscore, where >> +dash and underscore cannot occur consecutively. There are two Namely, right here. >> +# Names must be letters, numbers, -, and _. They must start with letter, >> +# except for downstream extensions which must start with __RFQDN_. >> +# Dots are only valid in the downstream extension prefix. >> +valid_name = re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' >> + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') > > This regexp consists of two parts: optional __RFQDN_ prefix and the name > proper. > > The latter stays simpler if we don't attempt to catch adjacent [-_]. > > The former isn't quite right. According to RFC 822 Appendix 1 - Domain > Name Syntax Specification: > > * We must accept '-' in addition to digits. > > * We require RFQDN to start with a letter. This is still a loose match. > The tight match is "labels separated by dots", where labels consist of > letters, digits and '-', starting with a letter. I wouldn't bother > checking first characters are letters at all. > > Recommend > > valid_name = re.compile('^(__[a-zA-Z0-9.-]+_)?' > '[a-zA-Z][a-zA-Z0-9_-]*$') Sure, that works for me. It's tighter than what we had before, but looser than what I proposed so that it allows more RFQDN. It potentially lets users confuse us by abusing 'foo__max' or similar, but I'm less worried about that abuse - it's okay to stop the blatantly obvious mistakes without worrying about the corner cases, if it makes the maintenance easier for the cases we do catch. > >> >> >> def check_name(expr_info, source, name, allow_optional=False, >> @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, allow_optional=False, >> % (source, name)) >> # Enum members can start with a digit, because the generated C >> # code always prefixes it with the enum name >> - if enum_member: >> - membername = '_' + membername >> + if enum_member and membername[0].isdigit(): > > What's wrong with the old condition? > >> + membername = 'D' + membername The old code prepended a lone '_', which doesn't work any more with the tighter valid_name regex, so we have to use some other prefix (I picked 'D'). >> # Reserve the entire 'q_' namespace for c_name() >> if not valid_name.match(membername) or \ >> c_name(membername, False).startswith('q_'): The other thing is that I realized that an enum value of 'q-int' was getting munged to '_q-int' which no longer gets flagged by the c_name() filter. But neither would 'Dq-int' get flagged. So limiting the munging to just enum values that start with a digit (where we know it doesn't start with q) means we don't weaken the second condition. I guess I need to call that out in the commit message; all the more reason for me to send a fixup.
Eric Blake <eblake@redhat.com> writes: > On 11/18/2015 04:51 AM, Markus Armbruster wrote: >> Eric Blake <eblake@redhat.com> writes: >> >>> We already documented that qapi names should match specific >>> patterns (such as starting with a letter unless it was an enum >>> value or a downstream extension). Tighten that from a suggestion >>> into a hard requirement, which frees up names beginning with a >>> single underscore for qapi internal usage. >> >> If we care enough about naming conventions to document them, enforcing >> them makes only sense. >> >>> Also restrict things >>> to avoid 'a__b' or 'a--b' (that is, dash and underscore must not >>> occur consecutively). >> >> I feel this is entering "foolish names" territory, where things like >> "IcAnFiNdNaMeSeVeNlEsSrEaDaBlEtHaNStudlyCaps" live. Catching fools > > IhAdToOmUcHfUnReAdInGtHaT > > [It's amazing that scholars can ever manage to correctly interpret old > written languages, thanks to omitted word breaks (scriptio continua) or > omitted vowels (such as in Hebrew)] Indeed. >> automatically is generally futile, they're too creative :) >> >> Let's drop this part. > > Sure, I can do that. I'll post a fixup patch, as it will affect docs too. > >> >>> Add a new test, reserved-member-underscore, to demonstrate the >>> tighter checking. >>> >>> Signed-off-by: Eric Blake <eblake@redhat.com> >>> >>> --- >>> v12: new patch >>> --- > >>> +incompatibly in a future release. All names must begin with a letter, >>> +and contain only ASCII letters, digits, dash, and underscore, where >>> +dash and underscore cannot occur consecutively. There are two > > Namely, right here. > >>> +# Names must be letters, numbers, -, and _. They must start with letter, >>> +# except for downstream extensions which must start with __RFQDN_. >>> +# Dots are only valid in the downstream extension prefix. >>> +valid_name = re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' >>> + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') >> >> This regexp consists of two parts: optional __RFQDN_ prefix and the name >> proper. >> >> The latter stays simpler if we don't attempt to catch adjacent [-_]. >> >> The former isn't quite right. According to RFC 822 Appendix 1 - Domain >> Name Syntax Specification: >> >> * We must accept '-' in addition to digits. >> >> * We require RFQDN to start with a letter. This is still a loose match. >> The tight match is "labels separated by dots", where labels consist of >> letters, digits and '-', starting with a letter. I wouldn't bother >> checking first characters are letters at all. >> >> Recommend >> >> valid_name = re.compile('^(__[a-zA-Z0-9.-]+_)?' >> '[a-zA-Z][a-zA-Z0-9_-]*$') > > Sure, that works for me. It's tighter than what we had before, but > looser than what I proposed so that it allows more RFQDN. It > potentially lets users confuse us by abusing 'foo__max' or similar, but > I'm less worried about that abuse - it's okay to stop the blatantly > obvious mistakes without worrying about the corner cases, if it makes > the maintenance easier for the cases we do catch. > >> >>> >>> >>> def check_name(expr_info, source, name, allow_optional=False, >>> @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, allow_optional=False, >>> % (source, name)) >>> # Enum members can start with a digit, because the generated C >>> # code always prefixes it with the enum name >>> - if enum_member: >>> - membername = '_' + membername >>> + if enum_member and membername[0].isdigit(): >> >> What's wrong with the old condition? >> >>> + membername = 'D' + membername > > The old code prepended a lone '_', which doesn't work any more with the > tighter valid_name regex, so we have to use some other prefix (I picked > 'D'). > >>> # Reserve the entire 'q_' namespace for c_name() >>> if not valid_name.match(membername) or \ >>> c_name(membername, False).startswith('q_'): > > The other thing is that I realized that an enum value of 'q-int' was > getting munged to '_q-int' which no longer gets flagged by the c_name() > filter. But neither would 'Dq-int' get flagged. So limiting the > munging to just enum values that start with a digit (where we know it > doesn't start with q) means we don't weaken the second condition. Aha! Worth a comment, I think. > I guess I need to call that out in the commit message; all the more > reason for me to send a fixup. Yes, please!
diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt index ceb9a78..ec225d1 100644 --- a/docs/qapi-code-gen.txt +++ b/docs/qapi-code-gen.txt @@ -118,17 +118,18 @@ tracking optional fields. Any name (command, event, type, field, or enum value) beginning with "x-" is marked experimental, and may be withdrawn or changed -incompatibly in a future release. Downstream vendors may add -extensions; such extensions should begin with a prefix matching +incompatibly in a future release. All names must begin with a letter, +and contain only ASCII letters, digits, dash, and underscore, where +dash and underscore cannot occur consecutively. There are two +exceptions: enum values may start with a digit, and any extensions +added by downstream vendors should start with a prefix matching "__RFQDN_" (for the reverse-fully-qualified-domain-name of the vendor), even if the rest of the name uses dash (example: -__com.redhat_drive-mirror). Other than downstream extensions (with -leading underscore and the use of dots), all names should begin with a -letter, and contain only ASCII letters, digits, dash, and underscore. -Names beginning with 'q_' are reserved for the generator: QMP names -that resemble C keywords or other problematic strings will be munged -in C to use this prefix. For example, a field named "default" in -qapi becomes "q_default" in the generated C code. +__com.redhat_drive-mirror). Names beginning with 'q_' are reserved +for the generator: QMP names that resemble C keywords or other +problematic strings will be munged in C to use this prefix. For +example, a field named "default" in qapi becomes "q_default" in the +generated C code. In the rest of this document, usage lines are given for each expression type, with literal strings written in lower case and diff --git a/scripts/qapi.py b/scripts/qapi.py index 4a30bc0..e6d014b 100644 --- a/scripts/qapi.py +++ b/scripts/qapi.py @@ -353,9 +353,11 @@ def discriminator_find_enum_define(expr): return find_enum(discriminator_type) -# FIXME should enforce "other than downstream extensions [...], all -# names should begin with a letter". -valid_name = re.compile('^[a-zA-Z_][a-zA-Z0-9_.-]*$') +# Names must be letters, numbers, -, and _. They must start with letter, +# except for downstream extensions which must start with __RFQDN_. +# Dots are only valid in the downstream extension prefix. +valid_name = re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') def check_name(expr_info, source, name, allow_optional=False, @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, allow_optional=False, % (source, name)) # Enum members can start with a digit, because the generated C # code always prefixes it with the enum name - if enum_member: - membername = '_' + membername + if enum_member and membername[0].isdigit(): + membername = 'D' + membername # Reserve the entire 'q_' namespace for c_name() if not valid_name.match(membername) or \ c_name(membername, False).startswith('q_'): diff --git a/tests/Makefile b/tests/Makefile index 1c2c8ee..b70d246 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -320,6 +320,7 @@ qapi-schema += reserved-command-q.json qapi-schema += reserved-member-has.json qapi-schema += reserved-member-q.json qapi-schema += reserved-member-u.json +qapi-schema += reserved-member-underscore.json qapi-schema += reserved-type-kind.json qapi-schema += reserved-type-list.json qapi-schema += returns-alternate.json diff --git a/tests/qapi-schema/reserved-member-underscore.err b/tests/qapi-schema/reserved-member-underscore.err new file mode 100644 index 0000000..65ff0da --- /dev/null +++ b/tests/qapi-schema/reserved-member-underscore.err @@ -0,0 +1 @@ +tests/qapi-schema/reserved-member-underscore.json:4: Member of 'data' for struct 'Oops' uses invalid name '_oops' diff --git a/tests/qapi-schema/reserved-member-underscore.exit b/tests/qapi-schema/reserved-member-underscore.exit new file mode 100644 index 0000000..d00491f --- /dev/null +++ b/tests/qapi-schema/reserved-member-underscore.exit @@ -0,0 +1 @@ +1 diff --git a/tests/qapi-schema/reserved-member-underscore.json b/tests/qapi-schema/reserved-member-underscore.json new file mode 100644 index 0000000..4a3a017 --- /dev/null +++ b/tests/qapi-schema/reserved-member-underscore.json @@ -0,0 +1,4 @@ +# C member name collision +# We reject use of a single leading underscore in all names (names must +# begin with a letter or a downstream extension double-underscore prefix). +{ 'struct': 'Oops', 'data': { '_oops': 'str' } }
We already documented that qapi names should match specific patterns (such as starting with a letter unless it was an enum value or a downstream extension). Tighten that from a suggestion into a hard requirement, which frees up names beginning with a single underscore for qapi internal usage. Also restrict things to avoid 'a__b' or 'a--b' (that is, dash and underscore must not occur consecutively). Add a new test, reserved-member-underscore, to demonstrate the tighter checking. Signed-off-by: Eric Blake <eblake@redhat.com> --- v12: new patch --- docs/qapi-code-gen.txt | 19 ++++++++++--------- scripts/qapi.py | 12 +++++++----- tests/Makefile | 1 + tests/qapi-schema/reserved-member-underscore.err | 1 + tests/qapi-schema/reserved-member-underscore.exit | 1 + tests/qapi-schema/reserved-member-underscore.json | 4 ++++ tests/qapi-schema/reserved-member-underscore.out | 0 7 files changed, 24 insertions(+), 14 deletions(-) create mode 100644 tests/qapi-schema/reserved-member-underscore.err create mode 100644 tests/qapi-schema/reserved-member-underscore.exit create mode 100644 tests/qapi-schema/reserved-member-underscore.json create mode 100644 tests/qapi-schema/reserved-member-underscore.out diff --git a/tests/qapi-schema/reserved-member-underscore.out b/tests/qapi-schema/reserved-member-underscore.out new file mode 100644 index 0000000..e69de29