Message ID | 20180209130754.16006-1-ricaljasan@pacific.net |
---|---|
State | New |
Series | manual: Document the standardized scanf flag, "m". [BZ #16376] |
On Fri, Feb 9, 2018 at 8:07 AM, Rical Jasan <ricaljasan@pacific.net> wrote:
> POSIX.1-2008 introduced the optional assignment-allocation modifier,
> "m", whose functionality was previously provided by the GNU extension
> "a".

OK with just one tweak...

> +You should free the buffer with @code{free} when you no longer need it.
> +
> +As a GNU extension predating @samp{m}, @samp{a} is also available, but
> +its use is considered deprecated.

let's be a little more specific here:

+As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
+This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
+may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
+floating-point numbers, followed by a literal @samp{s}.

zw
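
For readers following the thread, here is a minimal sketch of the POSIX.1-2008 "m" modifier with a plain `%s` conversion. The helper name `first_word` is hypothetical and not part of the patch; the sketch assumes a C library that implements "m" (glibc does).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not from the patch: return a malloc'd copy of
   the first whitespace-delimited word in INPUT, or NULL if there is
   none.  The caller releases the result with free.  */
static char *
first_word (const char *input)
{
  char *word = NULL;

  /* With "m", sscanf allocates the buffer itself and stores its
     address through the char ** argument.  */
  if (sscanf (input, "%ms", &word) != 1)
    return NULL;

  return word;
}
```

A caller would use it as `char *w = first_word (line);`, check the result for NULL, and call `free (w)` once the word is no longer needed.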
On Fri, 9 Feb 2018, Zack Weinberg wrote:

> > +As a GNU extension predating @samp{m}, @samp{a} is also available, but
> > +its use is considered deprecated.
>
> let's be a little more specific here:
>
> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
> +floating-point numbers, followed by a literal @samp{s}.

Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is
defined by default.  The redirection to __isoc99_scanf etc. is done if:

#if defined __USE_ISOC99 && !defined __USE_GNU \
    && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
    && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
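
To illustrate the ISO C99 reading of "%as" described above, here is a small sketch. It assumes the code is built as C99 or later without _GNU_SOURCE, so that "%a" is the floating-point conversion and "s" is a literal; the helper name `parse_seconds` is hypothetical. Under the deprecated GNU "a" modifier the same format would expect a `char **` argument, so this sketch must not be built that way.

```c
#include <stdio.h>

/* Hypothetical helper: accept input of the form "<number>s", e.g.
   "1.5s".  Returns 1 and stores the number in *OUT if the whole
   string matched, 0 otherwise.  Assumes the C99 meaning of "%a".  */
static int
parse_seconds (const char *input, float *out)
{
  int consumed = -1;

  /* "%a" reads a float, "s" must match a literal 's', and "%n"
     records how many characters were consumed if we got that far.  */
  if (sscanf (input, "%as%n", out, &consumed) != 1)
    return 0;

  return consumed >= 0 && input[consumed] == '\0';
}
```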
On Fri, Feb 9, 2018 at 11:39 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Fri, 9 Feb 2018, Zack Weinberg wrote:
>
>> > +As a GNU extension predating @samp{m}, @samp{a} is also available, but
>> > +its use is considered deprecated.
>>
>> let's be a little more specific here:
>>
>> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
>> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
>> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
>> +floating-point numbers, followed by a literal @samp{s}.
>
> Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is
> defined by default.  The redirection to __isoc99_scanf etc. is done if:
>
> #if defined __USE_ISOC99 && !defined __USE_GNU \
>     && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
>     && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)

I'm tempted to suggest that we drop the __USE_GNU - meaning that 'a'
would only be a modifier under -std=gnu89, if I'm reading that
correctly - both because it'll be easier to document, and because this
seems to be already what GCC's scanf format warnings do:

$ cat test.c
#include <stdio.h>

int main(void)
{
  char *s;
  scanf("%as", &s);
  puts(s);
  return 0;
}
$ gcc -std=gnu89 -Wformat test.c
$ gcc -std=gnu11 -Wformat test.c
test.c: In function ‘main’:
test.c:6:11: warning: format ‘%a’ expects argument of type ‘float *’, but argument 2 has type ‘char **’ [-Wformat=]
   scanf("%as", &s);
          ~^    ~~
$ gcc -std=gnu11 -Wformat -D_GNU_SOURCE test.c
test.c: In function ‘main’:
test.c:6:11: warning: format ‘%a’ expects argument of type ‘float *’, but argument 2 has type ‘char **’ [-Wformat=]
   scanf("%as", &s);
          ~^    ~~
$ gcc --version
gcc (Debian 7.3.0-3) 7.3.0
On Fri, 9 Feb 2018, Zack Weinberg wrote:

> >> +As a GNU extension, the modifier @samp{a} has the same effect as @samp{m}.
> >> +This extension predates POSIX.1-2008 and is now deprecated.  Other C libraries
> >> +may interpret e.g.@: @samp{%as} as the @samp{%a} format for reading
> >> +floating-point numbers, followed by a literal @samp{s}.
> >
> > Which glibc does in the absence of _GNU_SOURCE, since __USE_XOPEN2K is
> > defined by default.  The redirection to __isoc99_scanf etc. is done if:
> >
> > #if defined __USE_ISOC99 && !defined __USE_GNU \
> >     && (!defined __LDBL_COMPAT || !defined __REDIRECT) \
> >     && (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
>
> I'm tempted to suggest that we drop the __USE_GNU - meaning that 'a'
> would only be a modifier under -std=gnu89, if I'm reading that

That seems reasonable to me.  (With a corresponding change to
bits/stdio-ldbl.h to keep things consistent in the -mlong-double-64 case.)
diff --git a/manual/stdio.texi b/manual/stdio.texi
index 38be236991..22c338f8ea 100644
--- a/manual/stdio.texi
+++ b/manual/stdio.texi
@@ -3440,9 +3440,8 @@ successful assignments.
 
 @cindex flag character (@code{scanf})
 @item
-An optional flag character @samp{a} (valid with string conversions only)
+An optional flag character @samp{m} (valid with string conversions only)
 which requests allocation of a buffer long enough to store the string in.
-(This is a GNU extension.)
 @xref{Dynamic String Input}.
 
 @item
@@ -3720,8 +3719,8 @@ provide the buffer, always specify a maximum field width to prevent
 overflow.}
 
 @item
-Ask @code{scanf} to allocate a big enough buffer, by specifying the
-@samp{a} flag character.  This is a GNU extension.  You should provide
+Ask @code{scanf} to allocate a big-enough buffer, by specifying the
+@samp{m} flag character.  You should provide
 an argument of type @code{char **} for the buffer address to be stored
 in.  @xref{Dynamic String Input}.
 @end itemize
@@ -3825,7 +3824,7 @@ is said about @samp{%ls} above is true for @samp{%l[}.
 
 One more reminder: the @samp{%s} and @samp{%[} conversions are
 @strong{dangerous} if you don't specify a maximum width or use the
-@samp{a} flag, because input too long would overflow whatever buffer you
+@samp{m} flag, because too-long input would overflow whatever buffer you
 have provided for it.  No matter how long your buffer is, a user could
 supply input that is longer.  A well-written program reports invalid
 input with a comprehensible error message, not with a crash.
@@ -3833,18 +3832,27 @@ input with a comprehensible error message, not with a crash.
 @node Dynamic String Input
 @subsection Dynamically Allocating String Conversions
 
-A GNU extension to formatted input lets you safely read a string with no
+POSIX.1-2008 specifies an @dfn{assignment-allocation character}
+@samp{m}, valid for use with the string conversion specifiers
+@samp{s}, @samp{S}, @samp{[}, @samp{c}, and @samp{C}, which
+lets you safely read a string with no
 maximum size.  Using this feature, you don't supply a buffer; instead,
 @code{scanf} allocates a buffer big enough to hold the data and gives
-you its address.  To use this feature, write @samp{a} as a flag
-character, as in @samp{%as} or @samp{%a[0-9a-z]}.
+you its address.  To use this feature, write @samp{m} as a flag
+character; e.g., @samp{%ms} or @samp{%m[0-9a-z]}.
 
 The pointer argument you supply for where to store the input should have
 type @code{char **}.  The @code{scanf} function allocates a buffer and
-stores its address in the word that the argument points to.  You should
-free the buffer with @code{free} when you no longer need it.
+stores its address in the word that the argument points to.  When
+using the @samp{l} modifier (or equivalently, @samp{S} or @samp{C}),
+the pointer argument should have the type @code{wchar_t **}.
+
+You should free the buffer with @code{free} when you no longer need it.
+
+As a GNU extension predating @samp{m}, @samp{a} is also available, but
+its use is considered deprecated.
 
-Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]}
+Here is an example of using the @samp{m} flag with the @samp{%[@dots{}]}
 conversion specification to read a ``variable assignment'' of the form
 @samp{@var{variable} = @var{value}}.
 
@@ -3852,7 +3860,7 @@ conversion specification to read a ``variable assignment'' of the form
 @{
   char *variable, *value;
 
-  if (2 > scanf ("%m[a-zA-Z0-9] = %a[^\n]\n",
+  if (2 > scanf ("%m[a-zA-Z0-9] = %m[^\n]\n",
                  &variable, &value))
     @{
       invalid_input_error ();
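
For reference, a self-contained sketch of the manual's "variable assignment" example in plain C, with the Texinfo escapes removed. It is not part of the patch: the helper name `parse_assignment` is hypothetical, it reads from a string with `sscanf` rather than from standard input, and it assumes a C library that implements the "m" modifier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse "variable = value" from LINE.  On
   success, returns 1 and stores two malloc'd strings that the caller
   must free.  On failure, returns 0 and leaves both pointers NULL.  */
static int
parse_assignment (const char *line, char **variable, char **value)
{
  *variable = NULL;
  *value = NULL;

  if (2 > sscanf (line, "%m[a-zA-Z0-9] = %m[^\n]", variable, value))
    {
      /* The first conversion may have succeeded and allocated a
         buffer before the second one failed; release it.  */
      free (*variable);
      *variable = NULL;
      return 0;
    }

  return 1;
}
```

Initializing both pointers to NULL first keeps the failure path simple, since `free (NULL)` is a no-op.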