Message ID | 1343233543-18561-2-git-send-email-aliguori@us.ibm.com |
---|---|
State | New |
Headers | show |
On 07/25/2012 10:25 AM, Anthony Liguori wrote: > We don't use the standard C functions for conversion because we don't want to > depend on the user's locale. All option names in QEMU are en_US in plain ASCII. [Wondering if I should bring up the US 'canceled' vs. UK 'cancelled' as a counterpoint to the claim of everything being en_US, but then again, I didn't find any use of 'cancelled' as an option in qapi-schema.json.] > > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> > --- > qemu-option.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-------- > qemu-option.h | 2 ++ > 2 files changed, 47 insertions(+), 8 deletions(-) > > diff --git a/qemu-option.c b/qemu-option.c > index 8334190..6494c99 100644 > --- a/qemu-option.c > +++ b/qemu-option.c > @@ -89,6 +89,43 @@ const char *get_opt_value(char *buf, int buf_size, const char *p) > return p; > } > > +static int opt_tolower(int ch) > +{ > + if (ch >= 'A' && ch <= 'Z') { > + return 'a' + (ch - 'A'); > + } else if (ch == '_') { > + return '-'; Slick - making case-insensitive comparison also fold '-' and '_' together :) Reviewed-by: Eric Blake <eblake@redhat.com>
On 07/25/2012 10:45 AM, Eric Blake wrote: > On 07/25/2012 10:25 AM, Anthony Liguori wrote: >> We don't use the standard C functions for conversion because we don't want to >> depend on the user's locale. All option names in QEMU are en_US in plain ASCII. > >> >> +static int opt_tolower(int ch) >> +{ >> + if (ch >= 'A' && ch <= 'Z') { >> + return 'a' + (ch - 'A'); P.S. This is not portable to EBCDIC, but I guess we don't care about compilation of qemu on a non-ASCII machine, so my review still stands. > Reviewed-by: Eric Blake <eblake@redhat.com> >
On 25 July 2012 17:25, Anthony Liguori <aliguori@us.ibm.com> wrote: > We don't use the standard C functions for conversion because we don't want to > depend on the user's locale. All option names in QEMU are en_US in plain ASCII. > > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> > --- > qemu-option.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-------- > qemu-option.h | 2 ++ > 2 files changed, 47 insertions(+), 8 deletions(-) > > diff --git a/qemu-option.c b/qemu-option.c > index 8334190..6494c99 100644 > --- a/qemu-option.c > +++ b/qemu-option.c > @@ -89,6 +89,43 @@ const char *get_opt_value(char *buf, int buf_size, const char *p) > return p; > } > > +static int opt_tolower(int ch) This isn't actually doing a pure tolower() operation. opt_canonicalize() is a bit long though, perhaps. I guess it's static so not a big deal. +{ > + if (ch >= 'A' && ch <= 'Z') { > + return 'a' + (ch - 'A'); > + } else if (ch == '_') { > + return '-'; > + } > + > + return ch; > +} > + > +int qemu_opt_name_cmp(const char *lhs, const char *rhs) > +{ > + int i; > + > + g_assert(lhs && rhs); > + > + for (i = 0; lhs[i] && rhs[i]; i++) { > + int ret; > + > + ret = opt_tolower(lhs[i]) - opt_tolower(rhs[i]); This is not in line with the return value that the C library strcmp() would return. C99 7.21.4 says "The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared." This code will use whatever the signed/unsignedess of plain 'char' is. (None of the callers use the sign of the return value so this is something of a nitpick.) > + if (ret < 0) { > + return -1; > + } else if (ret > 0) { > + return 1; > + } > + } > + > + if (!lhs[i] && rhs[i]) { > + return -1; > + } else if (lhs[i] && !rhs[i]) { > + return 1; > + } If you made the for() loop into a do..while so that we execute the loop body for the 'final NUL' case you could avoid having this pair of checks, I think (and you can make the loop termination case just '!lhs[i]' since if we get past the 'mismatch' exits we've definitely got to the end of both strings and can return true). > + > + return 0; > +} > --- a/qemu-option.h > +++ b/qemu-option.h > @@ -141,4 +141,6 @@ int qemu_opts_print(QemuOpts *opts, void *dummy); > int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque, > int abort_on_failure); > > +int qemu_opt_name_cmp(const char *lhs, const char *rhs); No documentation comment? -- PMM
Peter Maydell <peter.maydell@linaro.org> writes: > On 25 July 2012 17:25, Anthony Liguori <aliguori@us.ibm.com> wrote: >> We don't use the standard C functions for conversion because we don't want to >> depend on the user's locale. All option names in QEMU are en_US in plain ASCII. >> >> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> >> --- >> qemu-option.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-------- >> qemu-option.h | 2 ++ >> 2 files changed, 47 insertions(+), 8 deletions(-) >> >> diff --git a/qemu-option.c b/qemu-option.c >> index 8334190..6494c99 100644 >> --- a/qemu-option.c >> +++ b/qemu-option.c >> @@ -89,6 +89,43 @@ const char *get_opt_value(char *buf, int buf_size, const char *p) >> return p; >> } >> >> +static int opt_tolower(int ch) > > This isn't actually doing a pure tolower() operation. > opt_canonicalize() is a bit long though, perhaps. > I guess it's static so not a big deal. Yeah. > > +{ >> + if (ch >= 'A' && ch <= 'Z') { >> + return 'a' + (ch - 'A'); >> + } else if (ch == '_') { >> + return '-'; >> + } >> + >> + return ch; >> +} >> + >> +int qemu_opt_name_cmp(const char *lhs, const char *rhs) >> +{ >> + int i; >> + >> + g_assert(lhs && rhs); >> + >> + for (i = 0; lhs[i] && rhs[i]; i++) { >> + int ret; >> + >> + ret = opt_tolower(lhs[i]) - opt_tolower(rhs[i]); > > This is not in line with the return value that the C library > strcmp() would return. C99 7.21.4 says "The sign of a nonzero > value returned by the comparison functions memcmp, strcmp, > and strncmp is determined by the sign of the difference between > the values of the first pair of characters (both interpreted > as unsigned char) that differ in the objects being compared." > This code will use whatever the signed/unsignedess of plain > 'char' is. > > (None of the callers use the sign of the return value so this > is something of a nitpick.) Sorry, how is this wrong? This returns: strcmp("a", "b") -> -1 qemu_opt_name_cmp("a", "b") -> -1 >> + if (ret < 0) { >> + return -1; >> + } else if (ret > 0) { >> + return 1; >> + } >> + } >> + >> + if (!lhs[i] && rhs[i]) { >> + return -1; >> + } else if (lhs[i] && !rhs[i]) { >> + return 1; >> + } > > If you made the for() loop into a do..while so that we > execute the loop body for the 'final NUL' case you could > avoid having this pair of checks, I think (and you can > make the loop termination case just '!lhs[i]' since if > we get past the 'mismatch' exits we've definitely got > to the end of both strings and can return true). I can poke around but not convinced it will result in better code. I must admit that I prefer explicit handling of edge cases like this. > >> + >> + return 0; >> +} > >> --- a/qemu-option.h >> +++ b/qemu-option.h >> @@ -141,4 +141,6 @@ int qemu_opts_print(QemuOpts *opts, void *dummy); >> int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque, >> int abort_on_failure); >> >> +int qemu_opt_name_cmp(const char *lhs, const char *rhs); > > No documentation comment? Good point. Regards, Anthony Liguori > > -- PMM
Eric Blake <eblake@redhat.com> writes: > On 07/25/2012 10:45 AM, Eric Blake wrote: >> On 07/25/2012 10:25 AM, Anthony Liguori wrote: >>> We don't use the standard C functions for conversion because we don't want to >>> depend on the user's locale. All option names in QEMU are en_US in plain ASCII. >> >>> >>> +static int opt_tolower(int ch) >>> +{ >>> + if (ch >= 'A' && ch <= 'Z') { >>> + return 'a' + (ch - 'A'); > > P.S. This is not portable to EBCDIC, but I guess we don't care about > compilation of qemu on a non-ASCII machine, so my review still stands. Fortunately, even on S390, Linux uses ASCII under normal circumstances :-) Regards, Anthony Liguori > >> Reviewed-by: Eric Blake <eblake@redhat.com> >> > > -- > Eric Blake eblake@redhat.com +1-919-301-3266 > Libvirt virtualization library http://libvirt.org
On 25 July 2012 18:33, Anthony Liguori <aliguori@us.ibm.com> wrote: > Peter Maydell <peter.maydell@linaro.org> writes: >> This is not in line with the return value that the C library >> strcmp() would return. C99 7.21.4 says "The sign of a nonzero >> value returned by the comparison functions memcmp, strcmp, >> and strncmp is determined by the sign of the difference between >> the values of the first pair of characters (both interpreted >> as unsigned char) that differ in the objects being compared." >> This code will use whatever the signed/unsignedess of plain >> 'char' is. >> >> (None of the callers use the sign of the return value so this >> is something of a nitpick.) > > Sorry, how is this wrong? > > This returns: > > strcmp("a", "b") -> -1 > qemu_opt_name_cmp("a", "b") -> -1 strcmp("\x90", "\x50") -> 64 qemu_opt_name_cmp("\x90", "\x50") -> -1 (assuming a 'char is signed' architecture like x86, or use "gcc -fsigned-char" if by some miracle you do your qemu development on an ARM box :-)) >> If you made the for() loop into a do..while so that we >> execute the loop body for the 'final NUL' case you could >> avoid having this pair of checks, I think (and you can >> make the loop termination case just '!lhs[i]' since if >> we get past the 'mismatch' exits we've definitely got >> to the end of both strings and can return true). > > I can poke around but not convinced it will result in better code. I > must admit that I prefer explicit handling of edge cases like this. Typically these basic C library functions are arranged so that the edge cases mostly fall out in the wash (because the original implementation was done to be simple to code rather than to provide particular semantics), so if I see an implementation that does special case them it makes me wonder if it's getting something wrong. For instance the BSD naive strcmp is just four lines and no special casing apart from "stop if we ran out of string": while (*s1 == *s2++) if (*s1++ == '\0') return (0); return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1)); -- PMM
Anthony Liguori <aliguori@us.ibm.com> writes: > We don't use the standard C functions for conversion because we don't want to > depend on the user's locale. All option names in QEMU are en_US in plain ASCII. Fails to explain the important part, namely the actual change to option comparison! > Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> > --- > qemu-option.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-------- > qemu-option.h | 2 ++ > 2 files changed, 47 insertions(+), 8 deletions(-) > > diff --git a/qemu-option.c b/qemu-option.c > index 8334190..6494c99 100644 > --- a/qemu-option.c > +++ b/qemu-option.c > @@ -89,6 +89,43 @@ const char *get_opt_value(char *buf, int buf_size, const char *p) > return p; > } > > +static int opt_tolower(int ch) > +{ > + if (ch >= 'A' && ch <= 'Z') { > + return 'a' + (ch - 'A'); > + } else if (ch == '_') { > + return '-'; > + } > + > + return ch; > +} > + > +int qemu_opt_name_cmp(const char *lhs, const char *rhs) > +{ > + int i; > + > + g_assert(lhs && rhs); Why bother? > + > + for (i = 0; lhs[i] && rhs[i]; i++) { > + int ret; > + > + ret = opt_tolower(lhs[i]) - opt_tolower(rhs[i]); > + if (ret < 0) { > + return -1; > + } else if (ret > 0) { > + return 1; > + } > + } > + > + if (!lhs[i] && rhs[i]) { > + return -1; > + } else if (lhs[i] && !rhs[i]) { > + return 1; > + } > + > + return 0; > +} > + Are you sure you want to make options case-insensitive? If yes, please mention it in the commit message. The implementation is too longwinded for my taste :) static unsigned char opt_canon_ch(char ch) { if (ch >= 'A' && ch <= 'Z') { return 'a' + (ch - 'A'); } else if (ch == '_') { return '-'; } return ch; } int qemu_opt_name_cmp(const char *lhs, const char *rhs) { unsigned char l, r; do { l = opt_canon_ch(*lhs++); r = opt_canon_ch(*rhs++); } while (l && l == r); return l - r; } Or maybe a bool qemu_opt_name_eq() instead. > int get_next_param_value(char *buf, int buf_size, > const char *tag, const char **pstr) > { > @@ -101,7 +138,7 @@ int get_next_param_value(char *buf, int buf_size, > if (*p != '=') > break; > p++; > - if (!strcmp(tag, option)) { > + if (!qemu_opt_name_cmp(tag, option)) { > *pstr = get_opt_value(buf, buf_size, p); > if (**pstr == ',') { > (*pstr)++; Aha, you cover the non-QemuOpts stuff, too. Fine with me. > @@ -137,7 +174,7 @@ int check_params(char *buf, int buf_size, > } > p++; > for (i = 0; params[i] != NULL; i++) { > - if (!strcmp(params[i], buf)) { > + if (!qemu_opt_name_cmp(params[i], buf)) { > break; > } > } > @@ -160,7 +197,7 @@ QEMUOptionParameter *get_option_parameter(QEMUOptionParameter *list, > const char *name) > { > while (list && list->name) { > - if (!strcmp(list->name, name)) { > + if (!qemu_opt_name_cmp(list->name, name)) { > return list; > } > list++; > @@ -516,7 +553,7 @@ static QemuOpt *qemu_opt_find(QemuOpts *opts, const char *name) > QemuOpt *opt; > > QTAILQ_FOREACH_REVERSE(opt, &opts->head, QemuOptHead, next) { > - if (strcmp(opt->name, name) != 0) > + if (qemu_opt_name_cmp(opt->name, name) != 0) > continue; > return opt; > } > @@ -599,7 +636,7 @@ static void opt_set(QemuOpts *opts, const char *name, const char *value, > int i; > > for (i = 0; desc[i].name != NULL; i++) { > - if (strcmp(desc[i].name, name) == 0) { > + if (qemu_opt_name_cmp(desc[i].name, name) == 0) { > break; > } > } > @@ -660,7 +697,7 @@ int qemu_opt_set_bool(QemuOpts *opts, const char *name, bool val) > int i; > > for (i = 0; desc[i].name != NULL; i++) { > - if (strcmp(desc[i].name, name) == 0) { > + if (qemu_opt_name_cmp(desc[i].name, name) == 0) { > break; > } > } > @@ -709,7 +746,7 @@ QemuOpts *qemu_opts_find(QemuOptsList *list, const char *id) > } > continue; > } > - if (strcmp(opts->id, id) != 0) { > + if (qemu_opt_name_cmp(opts->id, id) != 0) { > continue; > } > return opts; There's a strcmp(option, "id") in opts_do_parse(), and a similar one in qemu_opts_from_qdict_1(). Perhaps they should be compared with qemu_opt_name_cmp(), too, just for consistency. > @@ -1062,7 +1099,7 @@ void qemu_opts_validate(QemuOpts *opts, const QemuOptDesc *desc, Error **errp) > int i; > > for (i = 0; desc[i].name != NULL; i++) { > - if (strcmp(desc[i].name, opt->name) == 0) { > + if (qemu_opt_name_cmp(desc[i].name, opt->name) == 0) { > break; > } > } > diff --git a/qemu-option.h b/qemu-option.h > index 951dec3..ee27a13 100644 > --- a/qemu-option.h > +++ b/qemu-option.h > @@ -141,4 +141,6 @@ int qemu_opts_print(QemuOpts *opts, void *dummy); > int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque, > int abort_on_failure); > > +int qemu_opt_name_cmp(const char *lhs, const char *rhs); > + > #endif Why external?
On 27 July 2012 08:44, Markus Armbruster <armbru@redhat.com> wrote: > The implementation is too longwinded for my taste :) > > static unsigned char opt_canon_ch(char ch) > { > if (ch >= 'A' && ch <= 'Z') { > return 'a' + (ch - 'A'); > } else if (ch == '_') { > return '-'; > } > > return ch; > } > > int qemu_opt_name_cmp(const char *lhs, const char *rhs) > { > unsigned char l, r; > > do { > l = opt_canon_ch(*lhs++); > r = opt_canon_ch(*rhs++); > } while (l && l == r); > > return l - r; > } Yes, I like the function naming and this implementation is both shorter and more obviously correct. -- PMM
diff --git a/qemu-option.c b/qemu-option.c index 8334190..6494c99 100644 --- a/qemu-option.c +++ b/qemu-option.c @@ -89,6 +89,43 @@ const char *get_opt_value(char *buf, int buf_size, const char *p) return p; } +static int opt_tolower(int ch) +{ + if (ch >= 'A' && ch <= 'Z') { + return 'a' + (ch - 'A'); + } else if (ch == '_') { + return '-'; + } + + return ch; +} + +int qemu_opt_name_cmp(const char *lhs, const char *rhs) +{ + int i; + + g_assert(lhs && rhs); + + for (i = 0; lhs[i] && rhs[i]; i++) { + int ret; + + ret = opt_tolower(lhs[i]) - opt_tolower(rhs[i]); + if (ret < 0) { + return -1; + } else if (ret > 0) { + return 1; + } + } + + if (!lhs[i] && rhs[i]) { + return -1; + } else if (lhs[i] && !rhs[i]) { + return 1; + } + + return 0; +} + int get_next_param_value(char *buf, int buf_size, const char *tag, const char **pstr) { @@ -101,7 +138,7 @@ int get_next_param_value(char *buf, int buf_size, if (*p != '=') break; p++; - if (!strcmp(tag, option)) { + if (!qemu_opt_name_cmp(tag, option)) { *pstr = get_opt_value(buf, buf_size, p); if (**pstr == ',') { (*pstr)++; @@ -137,7 +174,7 @@ int check_params(char *buf, int buf_size, } p++; for (i = 0; params[i] != NULL; i++) { - if (!strcmp(params[i], buf)) { + if (!qemu_opt_name_cmp(params[i], buf)) { break; } } @@ -160,7 +197,7 @@ QEMUOptionParameter *get_option_parameter(QEMUOptionParameter *list, const char *name) { while (list && list->name) { - if (!strcmp(list->name, name)) { + if (!qemu_opt_name_cmp(list->name, name)) { return list; } list++; @@ -516,7 +553,7 @@ static QemuOpt *qemu_opt_find(QemuOpts *opts, const char *name) QemuOpt *opt; QTAILQ_FOREACH_REVERSE(opt, &opts->head, QemuOptHead, next) { - if (strcmp(opt->name, name) != 0) + if (qemu_opt_name_cmp(opt->name, name) != 0) continue; return opt; } @@ -599,7 +636,7 @@ static void opt_set(QemuOpts *opts, const char *name, const char *value, int i; for (i = 0; desc[i].name != NULL; i++) { - if (strcmp(desc[i].name, name) == 0) { + if (qemu_opt_name_cmp(desc[i].name, name) == 0) { break; } } @@ -660,7 +697,7 @@ int qemu_opt_set_bool(QemuOpts *opts, const char *name, bool val) int i; for (i = 0; desc[i].name != NULL; i++) { - if (strcmp(desc[i].name, name) == 0) { + if (qemu_opt_name_cmp(desc[i].name, name) == 0) { break; } } @@ -709,7 +746,7 @@ QemuOpts *qemu_opts_find(QemuOptsList *list, const char *id) } continue; } - if (strcmp(opts->id, id) != 0) { + if (qemu_opt_name_cmp(opts->id, id) != 0) { continue; } return opts; @@ -1062,7 +1099,7 @@ void qemu_opts_validate(QemuOpts *opts, const QemuOptDesc *desc, Error **errp) int i; for (i = 0; desc[i].name != NULL; i++) { - if (strcmp(desc[i].name, opt->name) == 0) { + if (qemu_opt_name_cmp(desc[i].name, opt->name) == 0) { break; } } diff --git a/qemu-option.h b/qemu-option.h index 951dec3..ee27a13 100644 --- a/qemu-option.h +++ b/qemu-option.h @@ -141,4 +141,6 @@ int qemu_opts_print(QemuOpts *opts, void *dummy); int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque, int abort_on_failure); +int qemu_opt_name_cmp(const char *lhs, const char *rhs); + #endif
We don't use the standard C functions for conversion because we don't want to depend on the user's locale. All option names in QEMU are en_US in plain ASCII. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> --- qemu-option.c | 53 +++++++++++++++++++++++++++++++++++++++++++++-------- qemu-option.h | 2 ++ 2 files changed, 47 insertions(+), 8 deletions(-)