[RFC] Extend __attribute__((format)) with user-specified conversion sequences

Message ID 4F773009.5090504@seas.harvard.edu
State New

Commit Message

Eddie Kohler March 31, 2012, 4:25 p.m. UTC
Hi,

Projects not uncommonly extend printf and/or scanf with new conversion 
characters, such as GCC's own %< and %> for fancy quotes. Unfortunately, new
conversion characters make __attribute__((format)) unusable for these 
functions, so errors go uncaught.

This patch adds an extra optional argument to __attribute__((format)) that 
defines new conversion characters.

The syntax is very simple. The extra argument, an even-length C string 
constant, is interpreted as a set of character pairs. For example, "<%" says 
"interpret the character '<' like you would '%'": as a conversion specifier 
that consumes no arguments from the argument list. "Ad" says "interpret 'A' 
like 'd'": as a conversion specifier that consumes an int. "<%>%,%;0" says 
'<', '>', and ',' are zero-argument conversion specifiers, and ';' is a flag 
like '0'.
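
A minimal sketch of the pair-string semantics described above, written as standalone C rather than as GCC code. It assumes only what the message states: the string has even length, and the first character of each pair is treated like the second. The names build_char_map and map_char are illustrative and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a 256-entry translation table from a pair string such as "<%;0".
   Every character starts out mapped to itself; each pair (from, to) then
   makes `from` behave like `to`.  Returns 0 on success, or -1 if the
   pair string has odd length.  */
static int
build_char_map (const char *pairs, char table[256])
{
  size_t len;
  size_t i;

  assert (pairs != NULL && table != NULL);
  len = strlen (pairs);
  if (len % 2 != 0)
    return -1;
  for (i = 0; i < 256; i++)
    table[i] = (char) i;
  for (i = 0; i < len; i += 2)
    table[(unsigned char) pairs[i]] = pairs[i + 1];
  return 0;
}

/* Return the character that C should be treated as.  */
static char
map_char (const char table[256], char c)
{
  return table[(unsigned char) c];
}
```

With the map "<%>%,%;0", map_char returns '%' for '<' and '0' for ';', while an unmapped character such as 'd' is returned unchanged.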

This is actually pretty flexible, simple to implement, and suffices for my use 
(the Click modular router). Perhaps you see obvious ways I could improve it? A 
full-fledged little language for format string definitions seems like too much 
work for now, unfortunately, but the character-pair syntax could be extended 
later.

Thanks for any comments or feedback,
Eddie Kohler

commit 97fb50f9dd205266ea7bb6e719105909c0d87f80
Author: Eddie Kohler <ekohler@gmail.com>
Date:   Sat Mar 31 11:44:38 2012 -0400

    gcc/c-family/
        * c-format.c (check_format_info_main): Support character mappings:
        users can change what characters mean per attribute.
        (create_dynamic_format_type): New function.
        (decode_format_attr): Use it when parsing attributes.
        * c-format.h (format_kind_info): Add char_map member.
        * c-common.c (c_common_format_attribute_table): Support it.
    
    gcc/doc/
        * extend.texi (function attributes): Document it.
    
    Signed-off-by: Eddie Kohler <ekohler@gmail.com>

Comments

Joseph Myers April 8, 2012, 10:25 a.m. UTC | #1
On Sat, 31 Mar 2012, Eddie Kohler wrote:

> The syntax is very simple. The extra argument, an even-length C string
> constant, is interpreted as a set of character pairs. For example, "<%" says
> "interpret the character '<' like you would '%'": as a conversion specifier
> that consumes no arguments from the argument list. "Ad" says "interpret 'A'
> like 'd'": as a conversion specifier that consumes an int. "<%>%,%;0" says
> '<', '>', and ',' are zero-argument conversion specifiers, and ';' is a flag
> like '0'.
> 
> This is actually pretty flexible, simple to implement, and suffices for my use
> (the Click modular router). Perhaps you see obvious ways I could improve it? A
> full-fledged little language for format string definitions seems like too much
> work for now, unfortunately, but the character-pair syntax could be extended
> later.

Character pairs like this don't seem very extensible, in that you are 
providing meanings for any even-length string, rather than (for example) 
only a limited subset of strings leaving room for meanings to be assigned 
later to other strings.

They also have the obvious problem of not covering application-specific 
types as arguments, only types that have corresponding standard formats.

In principle we want extensibility of format checking, and want it to be 
as flexible as the built-in checking is regarding the peculiarities of 
different formats - but we also don't want to export implementation 
details of format checking to users' source code, and the two points seem 
rather to contradict each other.  So my recent inclination has been that 
we should make it possible for plugins to add new format checking types 
(but the details of the relevant interfaces would be unstable, so such 
plugins might need to change for each GCC version).  That means a function 
for a plugin to register a new format type - and probably a callback 
called when that format type is used for a function declaration that can 
look for a typedef name in the same way that the existing GCC-internal 
formats are handled.
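
The registration idea can be sketched in standalone C. Everything here is hypothetical: no such interface exists in the patch or is claimed for GCC, and the names register_format_type, use_format_type, format_decl_callback and count_use are invented for illustration. The sketch only shows the shape suggested above: a registration function plus a callback that fires when a declaration uses the format type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback, invoked when a declaration names the format
   type, so that a plugin could do late setup such as a typedef lookup.  */
typedef void (*format_decl_callback) (const char *format_name, void *data);

struct format_registration
{
  const char *name;
  format_decl_callback on_use;
  void *data;
};

#define MAX_FORMATS 16
static struct format_registration registry[MAX_FORMATS];
static size_t n_registered;

/* Register a new format type.  Returns its index, or -1 if the name is
   already registered or the table is full.  */
static int
register_format_type (const char *name, format_decl_callback on_use,
                      void *data)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < n_registered; i++)
    if (strcmp (registry[i].name, name) == 0)
      return -1;
  if (n_registered == MAX_FORMATS)
    return -1;
  registry[n_registered].name = name;
  registry[n_registered].on_use = on_use;
  registry[n_registered].data = data;
  return (int) n_registered++;
}

/* Called when a declaration uses format type NAME.  Runs the callback,
   if any, and returns the index, or -1 if NAME is unknown.  */
static int
use_format_type (const char *name)
{
  size_t i;

  for (i = 0; i < n_registered; i++)
    if (strcmp (registry[i].name, name) == 0)
      {
        if (registry[i].on_use)
          registry[i].on_use (name, registry[i].data);
        return (int) i;
      }
  return -1;
}

/* Example callback: counts how many declarations used the format.  */
static void
count_use (const char *format_name, void *data)
{
  (void) format_name;
  ++*(int *) data;
}
```

A plugin would call register_format_type once at load time; the checker would call use_format_type each time it meets the name in a format attribute.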

Mike Stump April 8, 2012, 5:02 p.m. UTC | #2
On Apr 8, 2012, at 3:25 AM, Joseph S. Myers wrote:
> In principle we want extensibility of format checking, and want it to be 
> as flexible as the built-in checking is regarding the peculiarities of 
> different formats - but we also don't want to export implementation 
> details of format checking to users' source code, and the two points seem 
> rather to contradict each other.  So my recent inclination has been that 
> we should make it possible for plugins to add new format checking types 
> (but the details of the relevant interfaces would be unstable, so such 
> plugins might need to change for each GCC version).

Longer term, we can hope that the interface evolves into something stable and pretty.  For the plugin people that don't want the interface churn, a little time now in this area hopefully would translate into longer term stability.

Eddie Kohler April 9, 2012, 8:24 p.m. UTC | #3
Hi, thanks for the reply.

On 4/8/12 6:25 AM, Joseph S. Myers wrote:
> Character pairs like this don't seem very extensible, in that you are
> providing meanings for any even-length string, rather than (for example)
> only a limited subset of strings leaving room for meanings to be assigned
> later to other strings.

Future extensions could use identifiers rather than strings, or other 
syntax, but understood.

> They also have the obvious problem of not covering application-specific
> types as arguments, only types that have corresponding standard formats.

Agreed.

> In principle we want extensibility of format checking, and want it to be
> as flexible as the built-in checking is regarding the peculiarities of
> different formats - but we also don't want to export implementation
> details of format checking to users' source code, and the two points seem
> rather to contradict each other.  So my recent inclination has been that
> we should make it possible for plugins to add new format checking types
> (but the details of the relevant interfaces would be unstable, so such
> plugins might need to change for each GCC version).  That means a function
> for a plugin to register a new format type - and probably a callback
> called when that format type is used for a function declaration that can
> look for a typedef name in the same way that the existing GCC-internal
> formats are handled.

The plugin architecture would work great for format strings that are 
very different from printf/scanf, but seems heavyweight for format 
strings that are close to printf/scanf.

Would a better syntax for "printf/scanf + extensions" format strings be 
worth accepting independent of plugins?

For instance:

__attribute__((format(printf, 1, 2, '<' ('%'), ';' ('0'))))
==> treat '<' like '%' and ';' like '0'

__attribute__((format(printf, 1, 2, '<' (), 'F' (expr *))))
==> '<' takes 0 arguments, 'F' takes an expr * argument in the list
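
One way to picture what such per-character declarations would give a checker is a small table of conversion characters, each with the argument type it consumes (or none). The sketch below is standalone C under that assumption. The struct conv_spec and the function count_format_args are invented for illustration, and the scanner handles only a '%' followed directly by a conversion character, with no flags, widths or length modifiers.

```c
#include <assert.h>
#include <stddef.h>

/* One user-declared conversion: CH after '%' consumes one argument of
   type ARG_TYPE, or no argument when ARG_TYPE is NULL.  */
struct conv_spec
{
  char ch;
  const char *arg_type;
};

/* Count the arguments FORMAT consumes according to SPECS.  "%%" is a
   literal percent sign.  Returns -1 for an unknown conversion or a
   trailing '%'.  */
static int
count_format_args (const char *format, const struct conv_spec *specs,
                   size_t n_specs)
{
  int count = 0;
  const char *p;

  assert (format != NULL);
  for (p = format; *p; p++)
    {
      size_t i;

      if (*p != '%')
        continue;
      p++;                      /* Move to the conversion character.  */
      if (*p == '\0')
        return -1;
      if (*p == '%')
        continue;
      for (i = 0; i < n_specs; i++)
        if (specs[i].ch == *p)
          break;
      if (i == n_specs)
        return -1;
      if (specs[i].arg_type != NULL)
        count++;
    }
  return count;
}
```

With a table declaring '<' as taking no argument and 'F' as taking an expr *, the format "%<%F" consumes exactly one argument.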

Best,
Eddie

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index fc83b04..597f084 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -748,7 +748,7 @@  const struct attribute_spec c_common_format_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
        affects_type_identity } */
-  { "format",                 3, 3, false, true,  true,
+  { "format",                 3, 4, false, true,  true,
 			      handle_format_attribute, false },
   { "format_arg",             1, 1, false, true,  true,
 			      handle_format_arg_attribute, false },
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 9fabc39..cdebfcf 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -76,6 +76,7 @@  typedef struct function_format_info
 
 static bool decode_format_attr (tree, function_format_info *, int);
 static int decode_format_type (const char *);
+static int create_dynamic_format_type (int format_type, const char *char_map);
 
 static bool check_format_string (tree argument,
 				 unsigned HOST_WIDE_INT format_num,
@@ -274,6 +275,9 @@  decode_format_attr (tree args, function_format_info *info, int validated_p)
   tree format_num_expr = TREE_VALUE (TREE_CHAIN (args));
   tree first_arg_num_expr
     = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args)));
+  tree char_map_expr = TREE_CHAIN (TREE_CHAIN (TREE_CHAIN (args)));
+  if (char_map_expr)
+    char_map_expr = TREE_VALUE (char_map_expr);
 
   if (TREE_CODE (format_type_id) != IDENTIFIER_NODE)
     {
@@ -320,6 +324,20 @@  decode_format_attr (tree args, function_format_info *info, int validated_p)
       return false;
     }
 
+  if (char_map_expr != NULL)
+    {
+      if (TREE_CODE (char_map_expr) != STRING_CST
+	  || (TREE_STRING_LENGTH (char_map_expr) % 2) != 1
+	  || TREE_STRING_POINTER (char_map_expr)[TREE_STRING_LENGTH (char_map_expr) - 1] != 0)
+	{
+	  gcc_assert (!validated_p);
+	  error ("%<...%> has invalid format character map");
+	  return false;
+	}
+      info->format_type = create_dynamic_format_type (info->format_type,
+						      TREE_STRING_POINTER (char_map_expr));
+    }
+
   if (info->first_arg_num != 0 && info->first_arg_num <= info->format_num)
     {
       gcc_assert (!validated_p);
@@ -831,64 +849,64 @@  static const format_kind_info format_types_orig[] =
     printf_flag_specs, printf_flag_pairs,
     FMT_FLAG_ARG_CONVERT|FMT_FLAG_DOLLAR_MULTIPLE|FMT_FLAG_USE_DOLLAR|FMT_FLAG_EMPTY_PREC_OK,
     'w', 0, 'p', 0, 'L', 0,
-    &integer_type_node, &integer_type_node
+    &integer_type_node, &integer_type_node, NULL
   },
   { "asm_fprintf",   asm_fprintf_length_specs,  asm_fprintf_char_table, " +#0-", NULL,
     asm_fprintf_flag_specs, asm_fprintf_flag_pairs,
     FMT_FLAG_ARG_CONVERT|FMT_FLAG_EMPTY_PREC_OK,
     'w', 0, 'p', 0, 'L', 0,
-    NULL, NULL
+    NULL, NULL, NULL
   },
   { "gcc_diag",   gcc_diag_length_specs,  gcc_diag_char_table, "q+#", NULL,
     gcc_diag_flag_specs, gcc_diag_flag_pairs,
     FMT_FLAG_ARG_CONVERT,
     0, 0, 'p', 0, 'L', 0,
-    NULL, &integer_type_node
+    NULL, &integer_type_node, NULL
   },
   { "gcc_tdiag",   gcc_tdiag_length_specs,  gcc_tdiag_char_table, "q+#", NULL,
     gcc_tdiag_flag_specs, gcc_tdiag_flag_pairs,
     FMT_FLAG_ARG_CONVERT,
     0, 0, 'p', 0, 'L', 0,
-    NULL, &integer_type_node
+    NULL, &integer_type_node, NULL
   },
   { "gcc_cdiag",   gcc_cdiag_length_specs,  gcc_cdiag_char_table, "q+#", NULL,
     gcc_cdiag_flag_specs, gcc_cdiag_flag_pairs,
     FMT_FLAG_ARG_CONVERT,
     0, 0, 'p', 0, 'L', 0,
-    NULL, &integer_type_node
+    NULL, &integer_type_node, NULL
   },
   { "gcc_cxxdiag",   gcc_cxxdiag_length_specs,  gcc_cxxdiag_char_table, "q+#", NULL,
     gcc_cxxdiag_flag_specs, gcc_cxxdiag_flag_pairs,
     FMT_FLAG_ARG_CONVERT,
     0, 0, 'p', 0, 'L', 0,
-    NULL, &integer_type_node
+    NULL, &integer_type_node, NULL
   },
   { "gcc_gfc", gcc_gfc_length_specs, gcc_gfc_char_table, "", NULL,
     NULL, gcc_gfc_flag_pairs,
     FMT_FLAG_ARG_CONVERT,
     0, 0, 0, 0, 0, 0,
-    NULL, NULL
+    NULL, NULL, NULL
   },
   { "NSString",   NULL,  NULL, NULL, NULL,
     NULL, NULL,
     FMT_FLAG_ARG_CONVERT|FMT_FLAG_PARSE_ARG_CONVERT_EXTERNAL, 0, 0, 0, 0, 0, 0,
-    NULL, NULL
+    NULL, NULL, NULL
   },
   { "gnu_scanf",    scanf_length_specs,   scan_char_table,  "*'I", NULL,
     scanf_flag_specs, scanf_flag_pairs,
     FMT_FLAG_ARG_CONVERT|FMT_FLAG_SCANF_A_KLUDGE|FMT_FLAG_USE_DOLLAR|FMT_FLAG_ZERO_WIDTH_BAD|FMT_FLAG_DOLLAR_GAP_POINTER_OK,
     'w', 0, 0, '*', 'L', 'm',
-    NULL, NULL
+    NULL, NULL, NULL
   },
   { "gnu_strftime", NULL,                 time_char_table,  "_-0^#", "EO",
     strftime_flag_specs, strftime_flag_pairs,
     FMT_FLAG_FANCY_PERCENT_OK, 'w', 0, 0, 0, 0, 0,
-    NULL, NULL
+    NULL, NULL, NULL
   },
   { "gnu_strfmon",  strfmon_length_specs, monetary_char_table, "=^+(!-", NULL,
     strfmon_flag_specs, strfmon_flag_pairs,
     FMT_FLAG_ARG_CONVERT, 'w', '#', 'p', 0, 'L', 0,
-    NULL, NULL
+    NULL, NULL, NULL
   }
 };
 
@@ -901,6 +919,7 @@  static const format_kind_info *format_types = format_types_orig;
 static format_kind_info *dynamic_format_types;
 
 static int n_format_types = ARRAY_SIZE (format_types_orig);
+static int format_types_capacity = ARRAY_SIZE (format_types_orig);
 
 /* Structure detailing the results of checking a format function call
    where the format expression may be a conditional expression with
@@ -1605,6 +1624,8 @@  check_format_arg (void *ctx, tree format_tree,
 }
 
 
+#define FCHAR(c) (char_map ? char_map[(unsigned char) (c)] : (c))
+
 /* Do the main part of checking a call to a format function.  FORMAT_CHARS
    is the NUL-terminated format string (which at this point may contain
    internal NUL characters); FORMAT_LENGTH is its length (excluding the
@@ -1624,6 +1645,18 @@  check_format_info_main (format_check_results *res,
   const format_kind_info *fki = &format_types[info->format_type];
   const format_flag_spec *flag_specs = fki->flag_specs;
   const format_flag_pair *bad_flag_pairs = fki->bad_flag_pairs;
+  const char *char_map = NULL;
+  char char_map_storage[256];
+  if (fki->char_map)
+    {
+      int i;
+      const char *x;
+      for (i = 0; i < 256; i++)
+	char_map_storage[i] = i;
+      for (x = fki->char_map; *x; x += 2)
+	char_map_storage[(unsigned char) x[0]] = x[1];
+      char_map = char_map_storage;
+    }
 
   /* -1 if no conversions taking an operand have been found; 0 if one has
      and it didn't use $; 1 if $ formats are in use.  */
@@ -1664,7 +1697,7 @@  check_format_info_main (format_check_results *res,
 	  warning (OPT_Wformat, "spurious trailing %<%%%> in format");
 	  continue;
 	}
-      if (*format_chars == '%')
+      if (FCHAR (*format_chars) == '%')
 	{
 	  ++format_chars;
 	  continue;
@@ -1699,18 +1732,18 @@  check_format_info_main (format_check_results *res,
 	 duplicates, since in general validation depends on the rest of
 	 the format.  */
       while (*format_chars != 0
-	     && strchr (fki->flag_chars, *format_chars) != 0)
+	     && strchr (fki->flag_chars, FCHAR (*format_chars)) != 0)
 	{
 	  const format_flag_spec *s = get_flag_spec (flag_specs,
-						     *format_chars, NULL);
-	  if (strchr (flag_chars, *format_chars) != 0)
+						     FCHAR (*format_chars), NULL);
+	  if (strchr (flag_chars, FCHAR (*format_chars)) != 0)
 	    {
 	      warning (OPT_Wformat, "repeated %s in format", _(s->name));
 	    }
 	  else
 	    {
 	      i = strlen (flag_chars);
-	      flag_chars[i++] = *format_chars;
+	      flag_chars[i++] = FCHAR (*format_chars);
 	      flag_chars[i] = 0;
 	    }
 	  if (s->skip_next_char)
@@ -1728,7 +1761,7 @@  check_format_info_main (format_check_results *res,
       /* Read any format width, possibly * or *m$.  */
       if (fki->width_char != 0)
 	{
-	  if (fki->width_type != NULL && *format_chars == '*')
+	  if (fki->width_type != NULL && FCHAR (*format_chars) == '*')
 	    {
 	      i = strlen (flag_chars);
 	      flag_chars[i++] = fki->width_char;
@@ -1817,7 +1850,7 @@  check_format_info_main (format_check_results *res,
 	}
 
       /* Read any format left precision (must be a number, not *).  */
-      if (fki->left_precision_char != 0 && *format_chars == '#')
+      if (fki->left_precision_char != 0 && FCHAR (*format_chars) == '#')
 	{
 	  ++format_chars;
 	  i = strlen (flag_chars);
@@ -1830,13 +1863,13 @@  check_format_info_main (format_check_results *res,
 	}
 
       /* Read any format precision, possibly * or *m$.  */
-      if (fki->precision_char != 0 && *format_chars == '.')
+      if (fki->precision_char != 0 && FCHAR (*format_chars) == '.')
 	{
 	  ++format_chars;
 	  i = strlen (flag_chars);
 	  flag_chars[i++] = fki->precision_char;
 	  flag_chars[i] = 0;
-	  if (fki->precision_type != NULL && *format_chars == '*')
+	  if (fki->precision_type != NULL && FCHAR (*format_chars) == '*')
 	    {
 	      /* "...a...precision...may be indicated by an asterisk.
 		 In this case, an int argument supplies the...precision."  */
@@ -1907,7 +1940,7 @@  check_format_info_main (format_check_results *res,
 	}
 
       format_start = format_chars;
-      if (fki->alloc_char && fki->alloc_char == *format_chars)
+      if (fki->alloc_char && fki->alloc_char == FCHAR (*format_chars))
 	{
 	  i = strlen (flag_chars);
 	  flag_chars[i++] = fki->alloc_char;
@@ -1918,7 +1951,7 @@  check_format_info_main (format_check_results *res,
       /* Handle the scanf allocation kludge.  */
       if (fki->flags & (int) FMT_FLAG_SCANF_A_KLUDGE)
 	{
-	  if (*format_chars == 'a' && !flag_isoc99)
+	  if (FCHAR (*format_chars) == 'a' && !flag_isoc99)
 	    {
 	      if (format_chars[1] == 's' || format_chars[1] == 'S'
 		  || format_chars[1] == '[')
@@ -1940,13 +1973,20 @@  check_format_info_main (format_check_results *res,
       scalar_identity_flag = 0;
       if (fli)
 	{
-	  while (fli->name != 0
- 		 && strncmp (fli->name, format_chars, strlen (fli->name)))
+	  while (fli->name != 0)
+	    {
+	      int fli_pos = 0;
+	      while (fli->name[fli_pos]
+		     && FCHAR (format_chars[fli_pos]) == fli->name[fli_pos])
+		fli_pos++;
+	      if (!fli->name[fli_pos])
+		break;
 	      fli++;
+	    }
 	  if (fli->name != 0)
 	    {
  	      format_chars += strlen (fli->name);
-	      if (fli->double_name != 0 && fli->name[0] == *format_chars)
+	      if (fli->double_name != 0 && fli->name[0] == FCHAR (*format_chars))
 		{
 		  format_chars++;
 		  length_chars = fli->double_name;
@@ -1979,25 +2019,25 @@  check_format_info_main (format_check_results *res,
       if (fki->modifier_chars != NULL)
 	{
 	  while (*format_chars != 0
-		 && strchr (fki->modifier_chars, *format_chars) != 0)
+		 && strchr (fki->modifier_chars, FCHAR (*format_chars)) != 0)
 	    {
-	      if (strchr (flag_chars, *format_chars) != 0)
+	      if (strchr (flag_chars, FCHAR (*format_chars)) != 0)
 		{
 		  const format_flag_spec *s = get_flag_spec (flag_specs,
-							     *format_chars, NULL);
+							     FCHAR (*format_chars), NULL);
 		  warning (OPT_Wformat, "repeated %s in format", _(s->name));
 		}
 	      else
 		{
 		  i = strlen (flag_chars);
-		  flag_chars[i++] = *format_chars;
+		  flag_chars[i++] = FCHAR (*format_chars);
 		  flag_chars[i] = 0;
 		}
 	      ++format_chars;
 	    }
 	}
 
-      format_char = *format_chars;
+      format_char = FCHAR (*format_chars);
       if (format_char == 0
 	  || (!(fki->flags & (int) FMT_FLAG_FANCY_PERCENT_OK)
 	      && format_char == '%'))
@@ -2138,15 +2178,15 @@  check_format_info_main (format_check_results *res,
       if (strchr (fci->flags2, '[') != 0)
 	{
 	  /* Skip over scan set, in case it happens to have '%' in it.  */
-	  if (*format_chars == '^')
+	  if (FCHAR (*format_chars) == '^')
 	    ++format_chars;
 	  /* Find closing bracket; if one is hit immediately, then
 	     it's part of the scan set rather than a terminator.  */
-	  if (*format_chars == ']')
+	  if (FCHAR (*format_chars) == ']')
 	    ++format_chars;
-	  while (*format_chars && *format_chars != ']')
+	  while (*format_chars && FCHAR (*format_chars) != ']')
 	    ++format_chars;
-	  if (*format_chars != ']')
+	  if (FCHAR (*format_chars) != ']')
 	    /* The end of the format string was reached.  */
 	    warning (OPT_Wformat, "no closing %<]%> for %<%%[%> format");
 	}
@@ -2948,14 +2988,10 @@  cmp_attribs (const char *tattr_name, const char *attr_name)
   return true;
 }
 
-/* Handle a "format" attribute; arguments as in
-   struct attribute_spec.handler.  */
-tree
-handle_format_attribute (tree *node, tree ARG_UNUSED (name), tree args,
-			 int flags, bool *no_add_attrs)
+int
+create_dynamic_format_type (int format_type, const char *char_map)
 {
-  tree type = *node;
-  function_format_info info;
+  int i;
 
 #ifdef TARGET_FORMAT_TYPES
   /* If the target provides additional format types, we need to
@@ -2973,7 +3009,54 @@  handle_format_attribute (tree *node, tree ARG_UNUSED (name), tree args,
       /* Provide a reference for the first potential external type.  */
       first_target_format_type = n_format_types;
       n_format_types += TARGET_N_FORMAT_TYPES;
+      format_types_capacity = n_format_types;
+    }
+#endif
+
+  if (!char_map)
+    return 0;
+
+  for (i = 0; i < n_format_types; i++)
+    if (format_types[i].name == format_types[format_type].name
+	&& format_types[i].char_map
+	&& !strcmp (format_types[i].char_map, char_map))
+      return i;
+
+  if (n_format_types >= format_types_capacity)
+    {
+      format_kind_info *new_types = XNEWVEC (format_kind_info,
+					     format_types_capacity + 32);
+      memcpy (new_types, format_types,
+	      n_format_types * sizeof (format_types[0]));
+      if (format_types == dynamic_format_types)
+	XDELETEVEC (dynamic_format_types);
+      format_types = dynamic_format_types = new_types;
+      format_types_capacity += 32;
     }
+
+  i = n_format_types;
+  memcpy (&dynamic_format_types[i], &format_types[format_type],
+	  sizeof (format_types[i]));
+  dynamic_format_types[i].char_map = xstrdup (char_map);
+  n_format_types++;
+  return i;
+}
+
+/* Handle a "format" attribute; arguments as in
+   struct attribute_spec.handler.  */
+tree
+handle_format_attribute (tree *node, tree ARG_UNUSED (name), tree args,
+			 int flags, bool *no_add_attrs)
+{
+  tree type = *node;
+  function_format_info info;
+  tree argument;
+
+#ifdef TARGET_FORMAT_TYPES
+  /* If the target provides additional format types, we need to
+     add them to FORMAT_TYPES at first use.  */
+  if (TARGET_FORMAT_TYPES != NULL && !dynamic_format_types)
+    create_dynamic_format_type (0, NULL);
 #endif
 
   if (!decode_format_attr (args, &info, 0))
diff --git a/gcc/c-family/c-format.h b/gcc/c-family/c-format.h
index 286219b..fe586c1 100644
--- a/gcc/c-family/c-format.h
+++ b/gcc/c-family/c-format.h
@@ -252,6 +252,9 @@  typedef struct
   /* Pointer to type of argument expected if '*' is used for a precision,
      or NULL if '*' not used for precisions.  */
   tree *precision_type;
+  /* String listing character mapping pairs.  The first character in each
+     pair will be treated like the second.  */
+  char *char_map;
 } format_kind_info;
 
 #define T_I	&integer_type_node
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index bb43825..8f1ee4c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2524,6 +2524,7 @@  As gcc extension this calling convention can be used for C-functions
 and for static member methods.
 
 @item format (@var{archetype}, @var{string-index}, @var{first-to-check})
+@itemx format (@var{archetype}, @var{string-index}, @var{first-to-check}, @var{character-map})
 @cindex @code{format} function attribute
 @opindex Wformat
 The @code{format} attribute specifies that a function takes @code{printf},
@@ -2594,6 +2595,15 @@  will be parsed for correct syntax, however the result of checking of such format
 strings is not yet defined, and will not be carried out by this version of the
 compiler.
 
+The optional parameter @var{character-map} is a string literal that
+changes how the format string is interpreted.  A @var{character-map}
+consists of character pairs; when checking a format escape, GCC will
+treat the first character of each pair as if it were the second.  For
+example, the @var{character-map} @code{"<%"} tells GCC that the escape
+@code{"%<"} should be treated like @code{"%%"}, and @code{"AdBdCd"}
+says that @code{A}, @code{B}, and @code{C} each take an integer
+argument, like @code{"%d"}.
+
 The target may also provide additional types of format checks.
 @xref{Target Format Checks,,Format Checks Specific to Particular
 Target Machines}.
diff --git a/gcc/testsuite/gcc.dg/format/attr-3.c b/gcc/testsuite/gcc.dg/format/attr-3.c
index bee5ff4..0626bef 100644
--- a/gcc/testsuite/gcc.dg/format/attr-3.c
+++ b/gcc/testsuite/gcc.dg/format/attr-3.c
@@ -17,7 +17,8 @@  extern void fb0 (const char *, ...) __attribute__((format)); /* { dg-error "wron
 extern void fb1 (const char *, ...) __attribute__((format())); /* { dg-error "wrong number of arguments" "bad format" } */
 extern void fb2 (const char *, ...) __attribute__((format(gnu_attr_printf))); /* { dg-error "wrong number of arguments" "bad format" } */
 extern void fb3 (const char *, ...) __attribute__((format(gnu_attr_printf, 1))); /* { dg-error "wrong number of arguments" "bad format" } */
-extern void fb4 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, 3))); /* { dg-error "wrong number of arguments" "bad format" } */
+extern void fb4 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, 3))); /* { dg-error "invalid format character map" "bad format" } */
+extern void fb5 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "ab", 3))); /* { dg-error "wrong number of arguments" "bad format" } */
 
 extern void fc1 (const char *) __attribute__((format_arg)); /* { dg-error "wrong number of arguments" "bad format_arg" } */
 extern void fc2 (const char *) __attribute__((format_arg())); /* { dg-error "wrong number of arguments" "bad format_arg" } */
diff --git a/gcc/testsuite/gcc.dg/format/charmap-1.c b/gcc/testsuite/gcc.dg/format/charmap-1.c
new file mode 100644
index 0000000..c0ba1e0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/format/charmap-1.c
@@ -0,0 +1,63 @@ 
+/* Test for format attributes: test use of __attribute__.  */
+/* Origin: Eddie Kohler <kohler@seas.harvard.edu> */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99 -Wformat" } */
+
+#define DONT_GNU_PROTOTYPE
+#include "format.h"
+
+extern void printf1 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2)));
+extern void printf2 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "<%MdAd")));
+extern void printf3 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, 3))); /* { dg-error "invalid format character map" "bad format" } */
+extern void printf4 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "ab3"))); /* { dg-error "invalid format character map" "bad format" } */
+extern void printf5 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "ab3", 2))); /* { dg-error "wrong number of arguments" "bad format" } */
+/* printf6 has the same map as printf2 */
+extern void printf6 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "<%MdAd")));
+/* printf7 has a different map */
+extern void printf7 (const char *, ...) __attribute__((format(gnu_attr_printf, 1, 2, "<%Md")));
+
+void
+foo (int i, int *ip, double d)
+{
+  printf("X");
+  printf1("X");
+  printf2("X");
+  printf6("X");
+  printf7("X");
+
+  printf("b%d", 2);
+  printf1("b%d", 2);
+  printf2("b%d", 2);
+  printf6("b%d", 2);
+  printf7("b%d", 2);
+
+  printf("b%<"); /* { dg-warning "unknown" "%<" } */
+  printf1("b%<"); /* { dg-warning "unknown" "%<" } */
+  printf2("b%<");
+  printf6("b%<");
+  printf7("b%<");
+
+  printf("b%M"); /* { dg-warning "unknown" "%M" } */
+  printf1("b%M"); /* { dg-warning "unknown" "%M" } */
+  printf2("b%M"); /* { dg-warning "expects" "%M" } */
+  printf6("b%M"); /* { dg-warning "expects" "%M" } */
+  printf7("b%M"); /* { dg-warning "expects" "%M" } */
+
+  printf("b%M", 2); /* { dg-warning "unknown||too many" "%M" } */
+  printf1("b%M", 2); /* { dg-warning "unknown||too many" "%M" } */
+  printf2("b%M", 2);
+  printf6("b%M", 2);
+  printf7("b%M", 2);
+
+  printf("b%M", "X"); /* { dg-warning "unknown||too many" "%M" } */
+  printf1("b%M", "X"); /* { dg-warning "unknown||too many" "%M" } */
+  printf2("b%M", "X"); /* { dg-warning "expects argument of type 'int" "%M" } */
+  printf6("b%M", "X"); /* { dg-warning "expects argument of type 'int" "%M" } */
+  printf7("b%M", "X"); /* { dg-warning "expects argument of type 'int" "%M" } */
+
+  printf("b%A", 2); /* { dg-warning "expects argument of type 'double" "%A" } */
+  printf1("b%A", 2); /* { dg-warning "expects argument of type 'double" "%A" } */
+  printf2("b%A", 2);
+  printf6("b%A", 2);
+  printf7("b%A", 2); /* { dg-warning "expects argument of type 'double" "%A" } */
+}