Patchwork Updated: [Patch, c* ,ObjC*] handle string objects in format checking.

login
register
mail settings
Submitter Nicola Pero
Date Nov. 1, 2010, 4:37 a.m.
Message ID <1288586275.67821555@192.168.2.230>
Download mbox | patch
Permalink /patch/69750/
State New
Headers show

Comments

Nicola Pero - Nov. 1, 2010, 4:37 a.m.
>> The actual scanning of the format strings is now local to ObjC
>> (there's a stub routine added by the patch) and additions to carry
>> that out would fall under whatever rules Mike wishes to invoke.
>
>> Wouldn't we reuse the existing routines and simply add %@ ?
>
> Hm.
> I think to double-check the doc.
> IIRC there were some subtle differences from c. (on NeXT, at least)...

As far as I know, there are no intended differences other than the addition
of %@ (in other words, my description of the NSString format in my other
email should be correct and complete).

If you need to see Apple documentation, here it is --

http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/Strings/Articles/FormatStrings.html#//apple_ref/doc/uid/20000943-CJBEAEHH

It says 

"NSString uses a format string whose syntax is similar to that used by other formatter objects. 
It supports the format characters defined for the ANSI C function printf(), plus %@ for any object 
(see “String Format Specifiers” and the IEEE printf specification)."

In attach is a draft patch I had produced during the weekend - it works for me with the GNU runtime on top
of my other patches (but it doesn't include documentation (and not enough comments), has a few hacks and
rough edges, most likelyl won't work on Darwin and it doesn't do Objective-C++ yet). ;-)

Feel free to clean it up, steal the bits that you need, drop the ones that you don't need, make it work with Darwin 
as well as the GNU runtime and generally merge it with yours so we get full NSString format in 4.6.0. :-)

I hope that helps.  I'll now take a few days off the compiler, then come back in a few days to start 
the bug fixing round. :-)

Thanks
Binary files trunk9/gcc/testsuite/objc.dg/attributes/method-format-2.o and trunk8/gcc/testsuite/objc.dg/attributes/method-format-2.o differ
Paolo Bonzini - Nov. 1, 2010, 8:09 a.m.
>  2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
>
> +	* c-common.h (objc_is_string_type): New.
> +	(objc_maybe_extract_string_cst_from_objc_string): New.
> +	* stub-objc.c (objc_is_string_type): New.
> +	(objc_maybe_extract_string_cst_from_objc_string): New.	
> +	* c-format.c (is_string_type): New.
> +	(handle_format_arg_attribute): Updated call to
> +	check_format_string.  Use is_string_type.
> +	(check_format_string): Added objc_string argument.  If set, check
> +	that the format string is an Objective-C string.
> +	(handle_format_attribute): For nsstring_format_type, check that
> +	format strings are Objective-C strings.
> +	(struct format_wanted_type): Added objc_object_flag.
> +	(nsstring_length_specs): New.
> +	(nsstring_flag_pairs): New.
> +	(nsstring_flag_specs): New.
> +	(nsstring_char_table): New.
> +	(check_format_arg): Implemented nsstring_format_type.
> +	(check_format_info_main): Implemented '@' as valid flag2.
> +	(check_format_types): Implemented objc_object_flag.
> +	* c-format.h: Updated comments.
> +	(T89_AT): New.	

Just a heads-up: my patch at 
http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02437.html conflicts with 
this a bit, in that it changes a few messages in the attribute((format)) 
output.  So the tests would have to be adjusted.

Paolo
IainS - Nov. 1, 2010, 8:18 a.m.
On 1 Nov 2010, at 08:09, Paolo Bonzini wrote:

>> 2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
>>
>> +	* c-common.h (objc_is_string_type): New.
>> +	(objc_maybe_extract_string_cst_from_objc_string): New.
>> +	* stub-objc.c (objc_is_string_type): New.
>> +	(objc_maybe_extract_string_cst_from_objc_string): New.	
>> +	* c-format.c (is_string_type): New.
>> +	(handle_format_arg_attribute): Updated call to
>> +	check_format_string.  Use is_string_type.
>> +	(check_format_string): Added objc_string argument.  If set, check
>> +	that the format string is an Objective-C string.
>> +	(handle_format_attribute): For nsstring_format_type, check that
>> +	format strings are Objective-C strings.
>> +	(struct format_wanted_type): Added objc_object_flag.
>> +	(nsstring_length_specs): New.
>> +	(nsstring_flag_pairs): New.
>> +	(nsstring_flag_specs): New.
>> +	(nsstring_char_table): New.
>> +	(check_format_arg): Implemented nsstring_format_type.
>> +	(check_format_info_main): Implemented '@' as valid flag2.
>> +	(check_format_types): Implemented objc_object_flag.
>> +	* c-format.h: Updated comments.
>> +	(T89_AT): New.	
>
> Just a heads-up: my patch at http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02437.html 
>  conflicts with this a bit, in that it changes a few messages in the  
> attribute((format)) output.  So the tests would have to be adjusted.


I saw your patch and figured that whichever went in first there might  
be some tweaking of things ...
... I'm happy to adjust tests in ObjC*.

Apropos Nicola's additions,  I wonder if we really need to keep the  
handling of ObjC strings in ObJC.
Can we guarantee always to be able to fetch an equivalent string that  
is parse-able by the printf machinery?

(just thinking aloud).
Iain
IainS - Nov. 1, 2010, 8:49 a.m.
On 1 Nov 2010, at 04:37, Nicola Pero wrote:

>
>>> The actual scanning of the format strings is now local to ObjC
>>> (there's a stub routine added by the patch) and additions to carry
>>> that out would fall under whatever rules Mike wishes to invoke.
>>
>>> Wouldn't we reuse the existing routines and simply add %@ ?
>>
>> Hm.
>> I think to double-check the doc.
>> IIRC there were some subtle differences from c. (on NeXT, at  
>> least)...

> "NSString uses a format string whose syntax is similar to that used  
> by other formatter objects.
> It supports the format characters defined for the ANSI C function  
> printf(), plus %@ for any object
> (see “String Format Specifiers” and the IEEE printf specification)."
>
> In attach is a draft patch I had produced during the weekend - it  
> works for me with the GNU runtime on top
> of my other patches (but it doesn't include documentation (and not  
> enough comments), has a few hacks and
> rough edges, most likelyl won't work on Darwin and it doesn't do  
> Objective-C++ yet). ;-)

May I suggest a possible amalgamation sth like this:

(1) the patch I proposed (which handles the other necessary cases for  
NeXT and Darwin as well).

(2) merge your adjustments to the NSString format declaration.

       add a call to objc that determines if an NSString can be  
represented in a manner parse-able in c-format.c

       if so then it can continue to parse the literal as per the  
existing code (with the addition of the %@)
       otherwise parse in ObjC.

This is because the string literal behind an NS (or CF) string need  
not be ascii or even utf8 -- it could be utf16 (although we haven't  
implemented that yet).

I'm not sure that we shouldn't just keep the whole thing in ObjC (and  
add an objc-format.c file to deal with the future expansion).

In any event, have you any objection to first applying the syntax and  
parsing logic - assuming Joseph is now happy with it (since that patch  
is already reviewed twice) and then filling in the implementation?

cheers,
Iain
Nicola Pero - Nov. 1, 2010, 9:27 a.m.
> This is because the string literal behind an NS (or CF) string need  
> not be ascii or even utf8 -- it could be utf16 (although we haven't  
> implemented that yet).

Hmmm.  Excellent point by the way. ;-)

I still think we should do the NSString formatting in c-format.c.  I don't 
want to rewrite and maintain the whole of c-format.c in gcc/objc/. ;-)

[and in your patch you have another placeholder for CFString formatting in Darwin ... I
don't think we'd want to duplicate the entire printf formatting yet another time for 
Darwin just to allow CFStringRefs to be used instead of NSStrings ;-).]

We can probably find some sort of solution even for unicode strings.  
Probably ObjC could return the constant string converted to UTF-8 so that 
it can be fed into c-format.c without particular any changes to c-format.c ?


> In any event, have you any objection to first applying the syntax and  
> parsing logic - assuming Joseph is now happy with it (since that patch  
> is already reviewed twice) and then filling in the implementation?

No, I have no objection to doing it in two phases.  But it would make sense
to consider the second step when doing the first one. :-)

Thanks
IainS - Nov. 1, 2010, 9:41 a.m.
On 1 Nov 2010, at 09:27, Nicola Pero wrote:

>
>> This is because the string literal behind an NS (or CF) string need
>> not be ascii or even utf8 -- it could be utf16 (although we haven't
>> implemented that yet).
>
> Hmmm.  Excellent point by the way. ;-)
>
> I still think we should do the NSString formatting in c-format.c.  I  
> don't
> want to rewrite and maintain the whole of c-format.c in gcc/objc/. ;-)
>
> [and in your patch you have another placeholder for CFString  
> formatting in Darwin ... I
> don't think we'd want to duplicate the entire printf formatting yet  
> another time for
> Darwin just to allow CFStringRefs to be used instead of  
> NSStrings ;-).]

The only addition that could be made to distinguish CFStrings is that,  
in principle, (unimplemented at present) they could be backed by  
pascal style strings.

I agree entirely with the undesirability of maintaining unnecessary  
code.
(In my mind I had the possibility of calling back to c-format.c for  
'normal' string cases)

> We can probably find some sort of solution even for unicode strings.
> Probably ObjC could return the constant string converted to UTF-8 so  
> that
> it can be fed into c-format.c without particular any changes to c- 
> format.c ?

well, I think that would simply involve the two call-outs (perhaps  
with changed names) agreeing to return a utf-8 string.
this involves some @compile-time string conversion - but should be  
feasible (maybe libcpp already has the necessary).

if it's impossible to return utf8 for some reason then the fall-back  
is either to warn or to treat the string as non-literal (silently).

>> In any event, have you any objection to first applying the syntax and
>> parsing logic - assuming Joseph is now happy with it (since that  
>> patch
>> is already reviewed twice) and then filling in the implementation?
>
> No, I have no objection to doing it in two phases.  But it would  
> make sense
> to consider the second step when doing the first one. :-)

of course, I had slightly different plans - but we will converge I'm  
sure.

cheers
Iain
Joseph S. Myers - Nov. 1, 2010, 1:13 p.m.
On Mon, 1 Nov 2010, Nicola Pero wrote:

> Probably ObjC could return the constant string converted to UTF-8 so that 
> it can be fed into c-format.c without particular any changes to c-format.c ?

What's really wanted - to do wide string format checking for wprintf etc. 
(bug 38308), not just for ObjC - are separate accessors / iterators that 
in the narrow string case extract a byte, or a multibyte character, from a 
string.  (C format strings are complicated mixtures of bytes and multibyte 
characters.)  In the wide string case both would just end up extracting a 
single wide character.  It would be strongly desirable to have a design 
for string access that is not specific to format checking but can be used 
everywhere else that looks at contents of strings.  (All such 
optimizations - folding, built-in functions, etc. - ought to work equally 
on narrow and wide strings; I don't think they do at present.)

I will no longer ask that such changes actually define the formats for 
strings for non-8-bit-bytes as I did previously, and rather too 
forcefully, in response to 
<http://gcc.gnu.org/ml/gcc-patches/2001-12/msg01579.html>.  As long as you 
set up central infrastructure for iterating over strings it will be 
significantly easier for anyone adding such a target in future to define 
things in one place and fix other places to use the common infrastructure.  
Nor will I ask that the character set issues (bugs 20110 and 33748) 
necessarily be fixed at the same time.
IainS - Nov. 1, 2010, 2:29 p.m.
On 1 Nov 2010, at 13:13, Joseph S. Myers wrote:

> On Mon, 1 Nov 2010, Nicola Pero wrote:
>
>> Probably ObjC could return the constant string converted to UTF-8  
>> so that
>> it can be fed into c-format.c without particular any changes to c- 
>> format.c ?
>
> What's really wanted - to do wide string format checking for wprintf  
> etc.
> (bug 38308), not just for ObjC - are separate accessors / iterators  
> that
> in the narrow string case extract a byte, or a multibyte character,  
> from a
> string.  (C format strings are complicated mixtures of bytes and  
> multibyte
> characters.)  In the wide string case both would just end up  
> extracting a
> single wide character.  It would be strongly desirable to have a  
> design
> for string access that is not specific to format checking but can be  
> used
> everywhere else that looks at contents of strings.  (All such
> optimizations - folding, built-in functions, etc. - ought to work  
> equally
> on narrow and wide strings; I don't think they do at present.)
>
> I will no longer ask that such changes actually define the formats for
> strings for non-8-bit-bytes as I did previously, and rather too
> forcefully, in response to
> <http://gcc.gnu.org/ml/gcc-patches/2001-12/msg01579.html>.  As long  
> as you
> set up central infrastructure for iterating over strings it will be
> significantly easier for anyone adding such a target in future to  
> define
> things in one place and fix other places to use the common  
> infrastructure.
> Nor will I ask that the character set issues (bugs 20110 and 33748)
> necessarily be fixed at the same time.

This looks like something beyond 4.6?

Can I ask what the position is on the original patch which sets out to  
check syntax and deal with two string formats which are, essentially,  
outside our control?

[although the basic layout and capability of CFString might well  
satisfy the objectives for a common string container above, and is  
part of the FSF code-base]

thanks,
Iain
Joseph S. Myers - Nov. 1, 2010, 2:47 p.m.
On Mon, 1 Nov 2010, IainS wrote:

> This looks like something beyond 4.6?

Yes.

> Can I ask what the position is on the original patch which sets out to check
> syntax and deal with two string formats which are, essentially, outside our
> control?

I have a queue of patches to review.  Two format checking patches are on 
it.
Nicola Pero - Nov. 1, 2010, 7 p.m.
>> Probably ObjC could return the constant string converted to UTF-8 so that 
>> it can be fed into c-format.c without particular any changes to c-format.c ?
>
> What's really wanted - to do wide string format checking for wprintf etc. 
> (bug 38308), not just for ObjC - are separate accessors / iterators that [...]

Joseph,

thanks for explaining.  That is very interesting.  Would be worth working on it.

At the moment, I guess a first step would be to get NSString format string
parsing with ASCII Objective-C constant strings (ie, the same as printf). :-)

If the constant string is not ASCII, we could simply skip the check for now - and
come back to it in 4.7.0.  And I agree with you that we then want the general solution,
not just a quick hack. ;-)

Thanks

Patch

diff -rupN --exclude=.svn trunk9/gcc/c-family/c-common.h trunk8/gcc/c-family/c-common.h
--- trunk9/gcc/c-family/c-common.h	2010-10-30 12:26:13.000000000 +0100
+++ trunk8/gcc/c-family/c-common.h	2010-11-01 02:54:43.000000000 +0000
@@ -977,6 +977,7 @@  extern void c_parse_error (const char *,
 extern void objc_write_global_declarations (void);
 extern tree objc_is_class_name (tree);
 extern tree objc_is_object_ptr (tree);
+extern bool objc_is_string_type (tree);
 extern void objc_check_decl (tree);
 extern void objc_check_global_decl (tree);
 extern tree objc_common_type (tree, tree);
@@ -1001,6 +1002,7 @@  extern tree objc_build_selector_expr (lo
 extern tree objc_build_protocol_expr (tree);
 extern tree objc_build_encode_expr (tree);
 extern tree objc_build_string_object (tree);
+extern tree objc_maybe_extract_string_cst_from_objc_string (tree);
 extern tree objc_get_protocol_qualified_type (tree, tree);
 extern tree objc_get_class_reference (tree);
 extern tree objc_get_class_ivars (tree);
diff -rupN --exclude=.svn trunk9/gcc/c-family/c-format.c trunk8/gcc/c-family/c-format.c
--- trunk9/gcc/c-family/c-format.c	2010-10-31 16:14:36.000000000 +0000
+++ trunk8/gcc/c-family/c-format.c	2010-11-01 04:15:59.000000000 +0000
@@ -63,6 +63,7 @@  enum format_type { printf_format_type, a
 		   gcc_diag_format_type, gcc_tdiag_format_type,
 		   gcc_cdiag_format_type,
 		   gcc_cxxdiag_format_type, gcc_gfc_format_type,
+		   nsstring_format_type,
 		   format_type_error = -1};
 
 typedef struct function_format_info
@@ -77,12 +78,20 @@  static int decode_format_type (const cha
 
 static bool check_format_string (tree argument,
 				 unsigned HOST_WIDE_INT format_num,
-				 int flags, bool *no_add_attrs);
+				 int flags, bool *no_add_attrs, bool objc_string);
 static bool get_constant (tree expr, unsigned HOST_WIDE_INT *value,
 			  int validated_p);
 static const char *convert_format_name_to_system_name (const char *attr_name);
 static bool cmp_attribs (const char *tattr_name, const char *attr_name);
 
+/* Check that 'type' is a C string type.  */
+static bool
+is_string_type (tree type)
+{
+  return (TREE_CODE (type) == POINTER_TYPE
+	  && (TYPE_MAIN_VARIANT (TREE_TYPE (type)) == char_type_node));
+}
+
 /* Handle a "format_arg" attribute; arguments as in
    struct attribute_spec.handler.  */
 tree
@@ -101,16 +110,16 @@  handle_format_arg_attribute (tree *node,
       return NULL_TREE;
     }
 
+  /* TODO: Support for nsstring_format_type.  */
+
   argument = TYPE_ARG_TYPES (type);
   if (argument)
     {
-      if (!check_format_string (argument, format_num, flags, no_add_attrs))
+      if (!check_format_string (argument, format_num, flags, no_add_attrs, false))
 	return NULL_TREE;
     }
 
-  if (TREE_CODE (TREE_TYPE (type)) != POINTER_TYPE
-      || (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (type)))
-	  != char_type_node))
+  if (!is_string_type (TREE_TYPE (type)))
     {
       if (!(flags & (int) ATTR_FLAG_BUILT_IN))
 	error ("function does not return string type");
@@ -125,7 +134,7 @@  handle_format_arg_attribute (tree *node,
    the format attribute is in error.  */
 static bool
 check_format_string (tree argument, unsigned HOST_WIDE_INT format_num,
-		     int flags, bool *no_add_attrs)
+		     int flags, bool *no_add_attrs, bool objc_string)
 {
   unsigned HOST_WIDE_INT i;
 
@@ -136,15 +145,25 @@  check_format_string (tree argument, unsi
       argument = TREE_CHAIN (argument);
     }
 
-  if (!argument
-      || TREE_CODE (TREE_VALUE (argument)) != POINTER_TYPE
-      || (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_VALUE (argument)))
-	  != char_type_node))
+  if (!objc_string)
     {
-      if (!(flags & (int) ATTR_FLAG_BUILT_IN))
-	error ("format string argument not a string type");
-      *no_add_attrs = true;
-      return false;
+      if (!argument  ||  !is_string_type (TREE_VALUE (argument)))
+	{
+	  if (!(flags & (int) ATTR_FLAG_BUILT_IN))
+	    error ("format string argument not a string type");
+	  *no_add_attrs = true;
+	  return false;
+	}
+    }
+  else
+    {
+      if (!argument  ||  !objc_is_string_type (TREE_VALUE (argument)))
+	{
+	  if (!(flags & (int) ATTR_FLAG_BUILT_IN))
+	    error ("format string argument not an Objective-C string type");
+	  *no_add_attrs = true;
+	  return false;
+	}
     }
 
   return true;
@@ -273,6 +292,9 @@  typedef struct format_wanted_type
   /* Whether the argument, dereferenced once, is read from and so
      must not be a NULL pointer.  */
   int reading_from_flag;
+  /* Whether the argument must be an Objective-C object pointer type
+     and we need to use objc_is_object_ptr() to check it.  */
+  int objc_object_flag;
   /* If warnings should be of the form "field precision should have
      type 'int'", the name to use (in this case "field precision"),
      otherwise NULL, for "format expects type 'long'" type
@@ -358,6 +380,8 @@  static const format_length_info gcc_gfc_
   { NO_FMT, NO_FMT, 0 }
 };
 
+/* 'NSString' format is identical to printf for these.  */
+#define nsstring_length_specs printf_length_specs
 
 static const format_flag_spec printf_flag_specs[] =
 {
@@ -418,6 +442,8 @@  static const format_flag_pair gcc_gfc_fl
   { 0, 0, 0, 0 }
 };
 
+#define nsstring_flag_pairs printf_flag_pairs
+
 static const format_flag_spec gcc_diag_flag_specs[] =
 {
   { '+',  0, 0, N_("'+' flag"),        N_("the '+' printf flag"),              STD_C89 },
@@ -432,6 +458,8 @@  static const format_flag_spec gcc_diag_f
 #define gcc_cdiag_flag_specs gcc_diag_flag_specs
 #define gcc_cxxdiag_flag_specs gcc_diag_flag_specs
 
+#define nsstring_flag_specs printf_flag_specs
+
 static const format_flag_spec scanf_flag_specs[] =
 {
   { '*',  0, 0, N_("assignment suppression"), N_("the assignment suppression scanf feature"), STD_C89 },
@@ -651,6 +679,32 @@  static const format_char_info gcc_gfc_ch
   { NULL,  0, STD_C89, NOLENGTHS, NULL, NULL, NULL }
 };
 
+/* This is identical to the printf one, except for the addition of %@, used for Objective-C objects.  */
+static const format_char_info nsstring_char_table[] =
+{
+  /* C89 conversion specifiers.  */
+  { "di",  0, STD_C89, { T89_I,   T99_SC,  T89_S,   T89_L,   T9L_LL,  TEX_LL,  T99_SST, T99_PD,  T99_IM,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'I",  "i",  NULL },
+  { "oxX", 0, STD_C89, { T89_UI,  T99_UC,  T89_US,  T89_UL,  T9L_ULL, TEX_ULL, T99_ST,  T99_UPD, T99_UIM, BADLEN,  BADLEN,  BADLEN }, "-wp0#",     "i",  NULL },
+  { "u",   0, STD_C89, { T89_UI,  T99_UC,  T89_US,  T89_UL,  T9L_ULL, TEX_ULL, T99_ST,  T99_UPD, T99_UIM, BADLEN,  BADLEN,  BADLEN }, "-wp0'I",    "i",  NULL },
+  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T89_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64, TEX_D128 }, "-wp0 +#'I", "",   NULL },
+  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T89_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64, TEX_D128 }, "-wp0 +#I",  "",   NULL },
+  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T94_WI,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w",        "",   NULL },
+  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  T94_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",       "cR", NULL },
+  { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w",        "c",  NULL },
+  { "n",   1, STD_C89, { T89_I,   T99_SC,  T89_S,   T89_L,   T9L_LL,  BADLEN,  T99_SST, T99_PD,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",          "W",  NULL },
+  /* C99 conversion specifiers.  */
+  { "F",   0, STD_C99, { T99_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  TEX_D32, TEX_D64, TEX_D128 }, "-wp0 +#'I", "",   NULL },
+  { "aA",  0, STD_C99, { T99_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T99_LD,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +#",   "",   NULL },
+  /* X/Open conversion specifiers.  */
+  { "C",   0, STD_EXT, { TEX_WI,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w",        "",   NULL },
+  { "S",   1, STD_EXT, { TEX_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",       "R",  NULL },
+  /* GNU conversion specifiers.  */
+  { "m",   0, STD_EXT, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",       "",   NULL },
+  /* Objective-C conversion specfier.  */
+  { "@",   1, STD_C89, { T89_AT,  BADLEN,  BADLEN,  BADLEN,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",       "@", NULL },
+  { NULL,  0, STD_C89, NOLENGTHS, NULL, NULL, NULL }
+};
+
 static const format_char_info scan_char_table[] =
 {
   /* C89 conversion specifiers.  */
@@ -750,6 +804,12 @@  static const format_kind_info format_typ
     0, 0, 0, 0, 0, 0,
     NULL, NULL
   },
+  { "NSString", nsstring_length_specs, nsstring_char_table, "+#0-I", NULL,
+    nsstring_flag_specs, nsstring_flag_pairs,
+    FMT_FLAG_ARG_CONVERT|FMT_FLAG_DOLLAR_MULTIPLE|FMT_FLAG_USE_DOLLAR|FMT_FLAG_EMPTY_PREC_OK,
+    'w', 0, 'p', 0, 'L', 0,
+    &integer_type_node, &integer_type_node
+  },
   { "gnu_scanf",    scanf_length_specs,   scan_char_table,  "*'I", NULL,
     scanf_flag_specs, scanf_flag_pairs,
     FMT_FLAG_ARG_CONVERT|FMT_FLAG_SCANF_A_KLUDGE|FMT_FLAG_USE_DOLLAR|FMT_FLAG_ZERO_WIDTH_BAD|FMT_FLAG_DOLLAR_GAP_POINTER_OK,
@@ -1364,16 +1424,31 @@  check_format_arg (void *ctx, tree format
       array_size = DECL_SIZE_UNIT (format_tree);
       format_tree = array_init;
     }
+
+  /* nsstring_format_type is special because the format string is an
+     Objective-C string.  See if we can extract the constant
+     string.  */
+  if (info->format_type == nsstring_format_type)
+    format_tree = objc_maybe_extract_string_cst_from_objc_string (format_tree);
+
   if (TREE_CODE (format_tree) != STRING_CST)
     {
       res->number_non_literal++;
       return;
     }
-  if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format_tree))) != char_type_node)
+  /* FIXME: The STRING_CST associated with Objective-C constant
+     strings is most definitely a traditional ASCII string, yet it
+     seems not to be marked with a char_type_node.  Until that is
+     fixed, do not perform this test for Objective-C strings.  */
+  if (info->format_type != nsstring_format_type)
     {
-      res->number_wide++;
-      return;
+      if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format_tree))) != char_type_node)
+	{
+	  res->number_wide++;
+	  return;
+	}
     }
+
   format_chars = TREE_STRING_POINTER (format_tree);
   format_length = TREE_STRING_LENGTH (format_tree);
   if (array_size != 0)
@@ -2081,6 +2156,8 @@  check_format_info_main (format_check_res
 	      wanted_type_ptr->char_lenient_flag = 0;
 	      if (strchr (fci->flags2, 'c') != 0)
 		wanted_type_ptr->char_lenient_flag = 1;
+	      if (strchr (fci->flags2, '@') != 0)
+		wanted_type_ptr->objc_object_flag = 1;
 	      wanted_type_ptr->scalar_identity_flag = 0;
 	      if (scalar_identity_flag)
 		wanted_type_ptr->scalar_identity_flag = 1;
@@ -2241,6 +2318,10 @@  check_format_types (format_wanted_type *
 			  || cur_type == signed_char_type_node
 			  || cur_type == unsigned_char_type_node);
 
+      /* Check for Objective-C objects.  */
+      if (types->objc_object_flag  &&  objc_is_object_ptr (cur_type))
+	continue;
+
       /* Check the type of the "real" argument, if there's a type we want.  */
       if (lang_hooks.types_compatible_p (wanted_type, cur_type))
 	continue;
@@ -2798,8 +2879,16 @@  handle_format_attribute (tree *node, tre
   argument = TYPE_ARG_TYPES (type);
   if (argument)
     {
+      bool objc_string = false;
+
+      /* In Objective-C, the special NSString format type is allowed,
+	 where the format string is an Objective-C string instead of
+	 a C string.  */
+      if (c_dialect_objc ()  &&  info.format_type == nsstring_format_type)
+	objc_string = true;
+
       if (!check_format_string (argument, info.format_num, flags,
-				no_add_attrs))
+				no_add_attrs, objc_string))
 	return NULL_TREE;
 
       if (info.first_arg_num != 0)
diff -rupN --exclude=.svn trunk9/gcc/c-family/c-format.h trunk8/gcc/c-family/c-format.h
--- trunk9/gcc/c-family/c-format.h	2010-09-23 23:39:18.000000000 +0100
+++ trunk8/gcc/c-family/c-format.h	2010-11-01 04:17:07.000000000 +0000
@@ -142,14 +142,16 @@  typedef struct format_char_info
      modifiers.  */
   const char *flag_chars;
   /* List of additional flags describing these conversion specifiers.
-     "c" for generic character pointers being allowed, "2" for strftime
-     two digit year formats, "3" for strftime formats giving two digit
-     years in some locales, "4" for "2" which becomes "3" with an "E" modifier,
-     "o" if use of strftime "O" is a GNU extension beyond C99,
-     "W" if the argument is a pointer which is dereferenced and written into,
-     "R" if the argument is a pointer which is dereferenced and read from,
-     "i" for printf integer formats where the '0' flag is ignored with
-     precision, and "[" for the starting character of a scanf scanset.  */
+     "c" for generic character pointers being allowed, "2" for
+     strftime two digit year formats, "3" for strftime formats giving
+     two digit years in some locales, "4" for "2" which becomes "3"
+     with an "E" modifier, "o" if use of strftime "O" is a GNU
+     extension beyond C99, "W" if the argument is a pointer which is
+     dereferenced and written into, "R" if the argument is a pointer
+     which is dereferenced and read from, "i" for printf integer
+     formats where the '0' flag is ignored with precision, "[" for the
+     starting character of a scanf scanset and "@" for Objective-C
+     objects being allowed.  */
   const char *flags2;
   /* If this format conversion character consumes more than one argument,
      CHAIN points to information about the next argument.  For later
@@ -286,6 +288,9 @@  typedef struct
 #define T99_UC	{ STD_C99, NULL, T_UC }
 #define T_V	&void_type_node
 #define T89_V	{ STD_C89, NULL, T_V }
+/* TODO: In the following, it would be better to print 'id' but we
+   need to avoid the '*' for the pointer.  */
+#define T89_AT	{ STD_C89, "struct object", T_V }
 #define T_W	&wchar_type_node
 #define T94_W	{ STD_C94, "wchar_t", T_W }
 #define TEX_W	{ STD_EXT, "wchar_t", T_W }
diff -rupN --exclude=.svn trunk9/gcc/c-family/ChangeLog trunk8/gcc/c-family/ChangeLog
--- trunk9/gcc/c-family/ChangeLog	2010-10-31 17:36:18.000000000 +0000
+++ trunk8/gcc/c-family/ChangeLog	2010-11-01 04:22:58.000000000 +0000
@@ -1,5 +1,29 @@ 
 2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
 
+	* c-common.h (objc_is_string_type): New.
+	(objc_maybe_extract_string_cst_from_objc_string): New.
+	* stub-objc.c (objc_is_string_type): New.
+	(objc_maybe_extract_string_cst_from_objc_string): New.	
+	* c-format.c (is_string_type): New.
+	(handle_format_arg_attribute): Updated call to
+	check_format_string.  Use is_string_type.
+	(check_format_string): Added objc_string argument.  If set, check
+	that the format string is an Objective-C string.
+	(handle_format_attribute): For nsstring_format_type, check that
+	format strings are Objective-C strings.
+	(struct format_wanted_type): Added objc_object_flag.
+	(nsstring_length_specs): New.
+	(nsstring_flag_pairs): New.
+	(nsstring_flag_specs): New.
+	(nsstring_char_table): New.
+	(check_format_arg): Implemented nsstring_format_type.
+	(check_format_info_main): Implemented '@' as valid flag2.
+	(check_format_types): Implemented objc_object_flag.
+	* c-format.h: Updated comments.
+	(T89_AT): New.	
+	
+2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
+
 	Implemented format and noreturn attributes for Objective-C methods.
 	* c-common.c (handle_noreturn_attribute): Recognize 'noreturn'
 	attribute for Objective-C methods.
diff -rupN --exclude=.svn trunk9/gcc/c-family/stub-objc.c trunk8/gcc/c-family/stub-objc.c
--- trunk9/gcc/c-family/stub-objc.c	2010-10-30 12:26:13.000000000 +0100
+++ trunk8/gcc/c-family/stub-objc.c	2010-11-01 02:54:50.000000000 +0000
@@ -44,6 +44,12 @@  objc_is_object_ptr (tree ARG_UNUSED (arg
   return 0;
 }
 
+bool
+objc_is_string_type (tree ARG_UNUSED (arg))
+{
+  return 0;
+}
+
 bool objc_diagnose_private_ivar (tree ARG_UNUSED (arg))
 {
   return false;
@@ -281,6 +287,12 @@  objc_build_string_object (tree ARG_UNUSE
 }
 
 tree
+objc_maybe_extract_string_cst_from_objc_string (tree ARG_UNUSED (str))
+{
+  return 0;
+}
+
+tree
 objc_get_class_reference (tree ARG_UNUSED (name))
 {
   return 0;
diff -rupN --exclude=.svn trunk9/gcc/objc/ChangeLog trunk8/gcc/objc/ChangeLog
--- trunk9/gcc/objc/ChangeLog	2010-10-31 17:36:23.000000000 +0000
+++ trunk8/gcc/objc/ChangeLog	2010-11-01 04:23:36.000000000 +0000
@@ -1,5 +1,10 @@ 
 2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
 
+	* objc-act.c (objc_is_string_type): New.
+	(objc_maybe_extract_string_cst_from_objc_string): New.
+	
+2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
+
 	Implemented format and noreturn attributes for Objective-C methods.
 	* objc-act.c (objc_start_method_definition): If method attributes
 	are specified emit a warning and ignore them.
diff -rupN --exclude=.svn trunk9/gcc/objc/objc-act.c trunk8/gcc/objc/objc-act.c
--- trunk9/gcc/objc/objc-act.c	2010-10-31 17:52:48.000000000 +0000
+++ trunk8/gcc/objc/objc-act.c	2010-11-01 04:23:27.000000000 +0000
@@ -2611,6 +2611,61 @@  objc_build_string_object (tree string)
   return addr;
 }
 
+/* This is the inversion of objc_build_string_object: examine
+   'objc_string', and if it is an Objective-C constant string created
+   by objc_build_string_object, extract the C string embedded into it
+   and return it as a STRING_CST node.  Return 'objc_string' if
+   'objc_string' is not an Objective-C constant string, or if we could
+   not extract the STRING_CST.  */
+tree
+objc_maybe_extract_string_cst_from_objc_string (tree objc_string)
+{
+  tree constructor, expr;
+  VEC(constructor_elt,gc) *elts;
+
+  /* This is a simple implementation.  */
+  /* TODO: Better error checking.  */
+  if (TREE_CODE (objc_string) != VAR_DECL
+      || TREE_CODE (TREE_TYPE (objc_string)) != RECORD_TYPE)
+    return objc_string;
+
+  constructor = DECL_INITIAL (objc_string);
+
+  if (!TREE_READONLY (constructor)
+      || !TREE_CONSTANT (constructor)
+      || !TREE_STATIC (constructor))
+    return objc_string;
+
+  if (TREE_CODE (constructor) != CONSTRUCTOR)
+    return objc_string;
+
+  /* Get the second field (the first one is 'isa', the second one is
+     the constant string, the third one is the length.  */
+  elts = CONSTRUCTOR_ELTS (constructor);
+  if (VEC_length (constructor_elt, elts) != 3)
+    return objc_string;
+
+  expr = VEC_index (constructor_elt, elts, 1)->value;
+
+  if (!expr || TREE_CODE (expr) != ADDR_EXPR)
+    {
+      printf ("expr not ADDR_EXPR\n");
+      debug_tree (expr);
+      return objc_string;
+    }
+
+  expr = TREE_OPERAND (expr, 0);
+
+  if (!expr || TREE_CODE (expr) != STRING_CST)
+    {
+      printf ("expr not STRING_CST\n");
+      debug_tree (expr);
+      return objc_string;
+    }
+
+  return expr;
+}
+
 /* Declare a static instance of CLASS_DECL initialized by CONSTRUCTOR.  */
 
 static GTY(()) int num_static_inst;
@@ -3667,6 +3722,55 @@  objc_is_object_ptr (tree type)
   return ret;
 }
 
+/* Check whether TYPE is an Objective-C string class; at the moment,
+   this means NSString, NSMutableString or the constant string class.
+   This is used to check that a function or method marked with
+   __attribute__ ((format (NSString, 1, 2))) has the correct prototype
+   with an (NSString *) argument for the format string.  */
+
+bool
+objc_is_string_type (tree type)
+{
+  tree objc_type_name;
+
+  /* First, determine that it looks roughly like an Objective-C object
+     type.  */
+  type = TYPE_MAIN_VARIANT (type);
+  if (!POINTER_TYPE_P (type))
+    return false;
+
+  /* We determined it is a pointer; get what it points to.  */
+  type = TREE_TYPE (type);
+
+  /* It should point to a RECORD_TYPE (the Class).  */
+  if (TREE_CODE (type) != RECORD_TYPE)
+    return false;
+
+  /* Now, try to determine if it is an Objective-C string type.  A
+     careful check (TODO) would look at the class hierarchy and accept
+     any Objective-C class that inherits from NSString.  In practice,
+     this function is used to check the string format argument in an
+     NSString format type, which is practically always NSString, so we
+     can just check that the class name is NSString.  To be a bit more
+     flexible, we also accept NSMutableString and the constant string
+     class name.  */
+  objc_type_name = OBJC_TYPE_NAME (type);
+
+  if (objc_type_name  &&  TREE_CODE (objc_type_name) == IDENTIFIER_NODE)
+    {
+      const char * name = IDENTIFIER_POINTER (objc_type_name);
+      if (TYPE_HAS_OBJC_INFO (type)  &&  TYPE_OBJC_INTERFACE (type))
+	{
+	  if (strcmp (name, constant_string_class_name) == 0
+	      || strncmp (name, "NSString", 8) == 0
+	      || strncmp (name, "NSMutableString", 15) == 0)
+	    return true;
+	}
+    }
+
+  return false;
+}
+
 static int
 objc_is_gcable_type (tree type, int or_strong_p)
 {
diff -rupN --exclude=.svn trunk9/gcc/testsuite/ChangeLog trunk8/gcc/testsuite/ChangeLog
--- trunk9/gcc/testsuite/ChangeLog	2010-10-31 17:59:55.000000000 +0000
+++ trunk8/gcc/testsuite/ChangeLog	2010-11-01 04:23:59.000000000 +0000
@@ -1,4 +1,8 @@ 
 2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
+	
+	* objc.dg/attributes/method-format-2.m: New.
+
+2010-10-31  Nicola Pero  <nicola.pero@meta-innovation.com>
 
 	Implemented format and noreturn attributes for Objective-C methods.
 	* objc.dg/attributes/method-attribute-2.m: Updated warnings.
diff -rupN --exclude=.svn trunk9/gcc/testsuite/objc.dg/attributes/method-format-2.m trunk8/gcc/testsuite/objc.dg/attributes/method-format-2.m
--- trunk9/gcc/testsuite/objc.dg/attributes/method-format-2.m	1970-01-01 01:00:00.000000000 +0100
+++ trunk8/gcc/testsuite/objc.dg/attributes/method-format-2.m	2010-11-01 04:28:07.000000000 +0000
@@ -0,0 +1,59 @@ 
+/* Contributed by Nicola Pero <nicola.pero@meta-innovation.com>, October 2010.  */
+/* { dg-do compile } */
+/* { dg-options "-Wall -fconstant-string-class=MyString" } */
+
+#include <objc/objc.h>
+#include <stdlib.h>
+
+@interface MyString
+{
+@public
+  Class isa;
+  char *c_string;
+  unsigned int len;
+}
+@end
+
+@implementation MyString
+@end
+
+@interface LogObject
+{
+  Class isa;
+} 
++ (void) log: (int)level  message: (MyString *) my_format, ...  __attribute__ ((format (NSString, 2, 3)));
+- (void) log: (int)level  message: (MyString *) my_format, ...  __attribute__ ((format (NSString, 2, 3)));
+
++ (void) debug: (MyString *) my_format, ...  __attribute__ ((format (NSString, 1, 2)));
+- (void) debug: (MyString *) my_format, ...  __attribute__ ((format (NSString, 1, 2)));
+
+/* Just make sure a missing or invalid attribute won't crash the compiler.  */
+- (void) log2: (int)level  message: (MyString *) my_format, ...  __attribute__ ((format (NSString, 2)));      /* { dg-error "wrong" } */
+- (void) log3: (int)level  message: (const char *) my_format, ...  __attribute__ ((format (NSString, 2, 3))); /* { dg-error "not an Objective.C string type" } */
+- (void) log4: (int)level  message: (LogObject *) my_format, ...  __attribute__ ((format (NSString, 2, 3)));  /* { dg-error "not an Objective.C string type" } */
++ (void) debug2: (MyString *) my_format, ...  __attribute__ ((format (NSString))); /* { dg-error "wrong" } */
+- (void) debug2: (MyString *) my_format, ...  __attribute__ ((format (NSString))); /* { dg-error "wrong" } */
++ (void) alert: (MyString *) my_format __attribute__ ((format (NSString, 1, 2))); /* { dg-error "args to be formatted is not ..." } */
+- (void) alert: (MyString *) my_format __attribute__ ((format (NSString, 1, 2))); /* { dg-error "args to be formatted is not ..." } */
+@end
+
+void test (LogObject *object)
+{
+  [object log: 2  message: @"attribute only applies to variadic functions"];
+  [object log: 2  message: @"%d: attribute %@ only applies to variadic functions", 23, @"'format'"];
+  [object log: 2  message: @"%d: attribute %@ only applies to variadic functions", 23]; /* { dg-warning "too few arguments for format" } */
+  [object log: 2  message: @"%d: attribute %@ only applies to variadic functions", @"'format'", 23]; /* { dg-warning "format ..@. expects" } */
+
+  [object debug: @"attribute only applies to variadic functions"];
+  [object debug: @"%d:%d: attribute %@ only applies to variadic functions", 23, 11, @"'format'"];
+  [object debug: @"%d:%d: attribute %@ only applies to variadic functions", 23, @"'format'"]; /* { dg-warning "too few arguments for format" } */
+  [object debug: @"%d:%d: attribute %@ only applies to variadic functions", 23, @"'format'", 11]; /* { dg-warning "format ..@. expects" } */
+
+  [LogObject log: 2  message: @"attribute only applies to variadic functions"];
+  [LogObject log: 2  message: @"attribute %@ only applies to variadic functions", @"'format'"];
+  [LogObject log: 2  message: @"attribute %@ only applies to variadic functions"]; /* { dg-warning "too few arguments for format" } */
+
+  [LogObject debug: @"attribute only applies to variadic functions"];
+  [LogObject debug: @"attribute %@ only applies to variadic functions", @"'format'"];
+  [LogObject debug: @"attribute %@ only applies to variadic functions"]; /* { dg-warning "too few arguments for format" } */
+}