Patchwork ObjC parser/lexer tidyup patch

Submitter Nicola Pero
Date Sept. 22, 2010, 11:36 p.m.
Message ID <1285198614.304312392@192.168.2.228>
Permalink /patch/65476/
State New

Comments

Nicola Pero - Sept. 22, 2010, 11:36 p.m.
This patch refactors some of the lex/parser code of Objective-C to make
it easier to change and update (and understand).

In particular it removes the objc_is_reserved_word() ObjC function, 
which bundled two completely different cases (OBJC_IS_AT_KEYWORD 
and OBJC_IS_CXX_KEYWORD, but confusingly *not* OBJC_IS_PQ_KEYWORD) 
together, making them difficult to distinguish.

Instead, it adds a new OBJC_IS_CXX_KEYWORD macro and clearly spells out
the various cases in c-parser.c (c_lex_one_token) and in c-lex.c.  I also 
added comments explaining the various cases and issues.
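
For readers who do not have c-parser.c open, the three cases can be sketched as a small standalone C model. This is only an illustration, not the GCC code: the toy_* names, the tiny keyword table and the two token types are hypothetical stand-ins, and only the branch structure is meant to resemble what c_lex_one_token does for a name that is not preceded by '@' when compiling Objective-C.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, much reduced stand-ins for GCC's RID_* codes.  */
enum toy_rid {
  TOY_RID_NONE,       /* not a reserved word at all */
  TOY_RID_INT,        /* ordinary C keyword */
  TOY_RID_IN,         /* "PQ" keyword */
  TOY_RID_ONEWAY,     /* "PQ" keyword */
  TOY_RID_AT_ENCODE,  /* "AT" keyword, spelled 'encode' without the '@' */
  TOY_RID_CLASS,      /* "CXX" keyword */
  TOY_RID_TRY         /* "CXX" keyword */
};

enum toy_token_type { TOY_NAME, TOY_KEYWORD };

static bool toy_is_pq (enum toy_rid r)
{ return r == TOY_RID_IN || r == TOY_RID_ONEWAY; }

static bool toy_is_at (enum toy_rid r)
{ return r == TOY_RID_AT_ENCODE; }

static bool toy_is_cxx (enum toy_rid r)
{ return r == TOY_RID_CLASS || r == TOY_RID_TRY; }

/* Map a spelling to a toy rid code (assumed table, for illustration).  */
static enum toy_rid toy_lookup (const char *name)
{
  if (strcmp (name, "int") == 0)    return TOY_RID_INT;
  if (strcmp (name, "in") == 0)     return TOY_RID_IN;
  if (strcmp (name, "oneway") == 0) return TOY_RID_ONEWAY;
  if (strcmp (name, "encode") == 0) return TOY_RID_AT_ENCODE;
  if (strcmp (name, "class") == 0)  return TOY_RID_CLASS;
  if (strcmp (name, "try") == 0)    return TOY_RID_TRY;
  return TOY_RID_NONE;
}

/* Classify a name that was NOT preceded by '@', in plain Objective-C.
   pq_context models the parser's objc_pq_context flag.  */
enum toy_token_type toy_lex_name (const char *name, bool pq_context)
{
  enum toy_rid rid;

  assert (name != NULL);
  rid = toy_lookup (name);

  if (rid == TOY_RID_NONE)
    return TOY_NAME;

  /* Case 1: "PQ" keywords are keywords only in a PQ context.  */
  if (toy_is_pq (rid))
    return pq_context ? TOY_KEYWORD : TOY_NAME;

  /* Cases 2 and 3: "AT" and "CXX" keywords without a preceding '@'
     are ordinary identifiers in Objective-C.  */
  if (toy_is_at (rid) || toy_is_cxx (rid))
    return TOY_NAME;

  /* Everything else is a normal C keyword.  */
  return TOY_KEYWORD;
}
```

In this model, toy_lex_name ("in", true) yields TOY_KEYWORD while toy_lex_name ("in", false) yields TOY_NAME, and 'class' or 'encode' without '@' are always plain names, which is the behaviour the new testcases check.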

Everything works exactly as before, and there are no regressions.

I also added some more testcases to ObjC/ObjC++ to make sure I wasn't breaking
anything (and I won't be breaking anything in subsequent patches).

Ok to apply ?

Thanks

PS: This patch depends on Mike approving the previous "ObjC patch - do not replace 
token->value with canonical spelling"
Mike Stump - Sept. 28, 2010, 5:35 p.m.
On Sep 22, 2010, at 4:36 PM, Nicola Pero wrote:
> This patch refactors some of the lex/parser code of Objective-C to make
> it easier to change and update (and understand).

> Ok to apply ?

Ok.  My only concern would be, as we change trunk, applying patches from the apple-branch becomes harder...  Normally, I'd merge all, then cleanup, but since you guys are doing most of the work...  you can plan that accordingly.

> PS: This patch depends on Mike approving the previous "ObjC patch - do not replace 
> token->value with canonical spelling"

Re-ping anything outstanding...  I think I'm caught up on your patches....
Nicola Pero - Sept. 28, 2010, 5:43 p.m.
> Ok.  My only concern would be, as we change trunk, 
> applying patches from the apple-branch becomes harder...  

Yes, it's a very good concern, which I share. :-)

I have been trying to avoid making any changes to areas 
where there is still stuff to merge.

I only start changing once there isn't anything else to merge
[in the case of @encode and similar, I have already merged 
everything.  In the case of the C parser/lexer, it's impossible
to merge anyway since the apple branch uses the old parse.in 
parser ;-)]

Thanks

Patch

Index: gcc/c-family/ChangeLog
===================================================================
--- gcc/c-family/ChangeLog      (revision 164528)
+++ gcc/c-family/ChangeLog      (working copy)
@@ -1,3 +1,12 @@ 
+2010-09-22  Nicola Pero  <nicola.pero@meta-innovation.com>
+
+       * c-common.h (OBJC_IS_CXX_KEYWORD): New macro.  Updated comments.
+       (objc_is_reserved_word): Removed.
+       * c-common.c: Updated comments.
+       * c-lex.c (c_lex_with_flags): Use OBJC_IS_CXX_KEYWORD instead of
+       objc_is_reserved_word.
+       * stub-objc.c (objc_is_reserved_word): Removed.
+       
 2010-09-21  Nicola Pero  <nicola.pero@meta-innovation.com>
 
        PR objc/23710
Index: gcc/c-family/c-lex.c
===================================================================
--- gcc/c-family/c-lex.c        (revision 164528)
+++ gcc/c-family/c-lex.c        (working copy)
@@ -366,7 +366,8 @@  c_lex_with_flags (tree *value, location_t *loc, un
 
            case CPP_NAME:
              *value = HT_IDENT_TO_GCC_IDENT (HT_NODE (tok->val.node.node));
-             if (objc_is_reserved_word (*value))
+             if (OBJC_IS_AT_KEYWORD (C_RID_CODE (*value))
+                 || OBJC_IS_CXX_KEYWORD (C_RID_CODE (*value)))
                {
                  type = CPP_AT_NAME;
                  break;
Index: gcc/c-family/c-common.c
===================================================================
--- gcc/c-family/c-common.c     (revision 164528)
+++ gcc/c-family/c-common.c     (working copy)
@@ -379,8 +379,13 @@  static int resort_field_decl_cmp (const void *, co
    If -fno-asm is used, D_ASM is added to the mask.  If
    -fno-gnu-keywords is used, D_EXT is added.  If -fno-asm and C in
    C89 mode, D_EXT89 is added for both -fno-asm and -fno-gnu-keywords.
-   In C with -Wc++-compat, we warn if D_CXXWARN is set.  */
+   In C with -Wc++-compat, we warn if D_CXXWARN is set.
 
+   Note the complication of the D_CXX_OBJC keywords.  These are
+   reserved words such as 'class'.  In C++, 'class' is a reserved
+   word.  In Objective-C++ it is too.  In Objective-C, it is a
+   reserved word too, but only if it follows an '@' sign.
+*/
 const struct c_common_resword c_common_reswords[] =
 {
   { "_Bool",           RID_BOOL,      D_CONLY },
Index: gcc/c-family/c-common.h
===================================================================
--- gcc/c-family/c-common.h     (revision 164528)
+++ gcc/c-family/c-common.h     (working copy)
@@ -76,7 +76,8 @@  enum rid
   /* C++ */
   RID_FRIEND, RID_VIRTUAL, RID_EXPLICIT, RID_EXPORT, RID_MUTABLE,
 
-  /* ObjC */
+  /* ObjC ("PQ" reserved words - they do not appear after a '@' and
+     are keywords only in specific contexts)  */
   RID_IN, RID_OUT, RID_INOUT, RID_BYCOPY, RID_BYREF, RID_ONEWAY,
 
   /* C (reserved and imaginary types not implemented, so any use is a
@@ -105,7 +106,8 @@  enum rid
   /* Too many ways of getting the name of a function as a string */
   RID_FUNCTION_NAME, RID_PRETTY_FUNCTION_NAME, RID_C99_FUNCTION_NAME,
 
-  /* C++ */
+  /* C++ (some of these are keywords in Objective-C as well, but only
+     if they appear after a '@') */
   RID_BOOL,     RID_WCHAR,    RID_CLASS,
   RID_PUBLIC,   RID_PRIVATE,  RID_PROTECTED,
   RID_TEMPLATE, RID_NULL,     RID_CATCH,
@@ -133,7 +135,8 @@  enum rid
   /* C++0x */
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
-  /* Objective-C */
+  /* Objective-C ("AT" reserved words - they are only keywords when
+     they follow '@')  */
   RID_AT_ENCODE,   RID_AT_END,
   RID_AT_CLASS,    RID_AT_ALIAS,     RID_AT_DEFS,
   RID_AT_PRIVATE,  RID_AT_PROTECTED, RID_AT_PUBLIC,
@@ -188,6 +191,18 @@  enum rid
   ((unsigned int) (rid) >= (unsigned int) RID_FIRST_PQ && \
    (unsigned int) (rid) <= (unsigned int) RID_LAST_PQ)
 
+/* OBJC_IS_CXX_KEYWORD recognizes the 'CXX_OBJC' keywords (such as
+   'class') which are shared in a subtle way between Objective-C and
+   C++.  When the lexer is lexing in Objective-C/Objective-C++, if it
+   finds '@' followed by one of these identifiers (eg, '@class'), it
+   recognizes the whole as an Objective-C keyword.  If the identifier
+   is found elsewhere, it follows the rules of the C/C++ language.
+ */
+#define OBJC_IS_CXX_KEYWORD(rid) \
+  (rid == RID_CLASS                                                    \
+   || rid == RID_PUBLIC || rid == RID_PROTECTED || rid == RID_PRIVATE  \
+   || rid == RID_TRY || rid == RID_THROW || rid == RID_CATCH)
+
 /* The elements of `ridpointers' are identifier nodes for the reserved
    type names and storage classes.  It is indexed by a RID_... value.  */
 extern GTY ((length ("(int) RID_MAX"))) tree *ridpointers;
@@ -940,7 +955,6 @@  extern void c_parse_error (const char *, enum cpp_
 extern tree objc_is_class_name (tree);
 extern tree objc_is_object_ptr (tree);
 extern void objc_check_decl (tree);
-extern int objc_is_reserved_word (tree);
 extern bool objc_compare_types (tree, tree, int, tree);
 extern void objc_volatilize_decl (tree);
 extern bool objc_type_quals_match (tree, tree);
Index: gcc/c-family/stub-objc.c
===================================================================
--- gcc/c-family/stub-objc.c    (revision 164528)
+++ gcc/c-family/stub-objc.c    (working copy)
@@ -56,12 +56,6 @@  objc_check_decl (tree ARG_UNUSED (decl))
 {
 }
 
-int
-objc_is_reserved_word (tree ARG_UNUSED (ident))
-{
-  return 0;
-}
-
 bool
 objc_compare_types (tree ARG_UNUSED (ltyp), tree ARG_UNUSED (rtyp),
                    int ARG_UNUSED (argno), tree ARG_UNUSED (callee))
Index: gcc/objc/objc-act.c
===================================================================
--- gcc/objc/objc-act.c (revision 164528)
+++ gcc/objc/objc-act.c (working copy)
@@ -824,20 +824,6 @@  objc_add_instance_variable (tree decl)
                                decl);
 }
 
-/* Return 1 if IDENT is an ObjC/ObjC++ reserved keyword in the context of
-   an '@'.  */
-
-int
-objc_is_reserved_word (tree ident)
-{
-  unsigned char code = C_RID_CODE (ident);
-
-  return (OBJC_IS_AT_KEYWORD (code)
-         || code == RID_CLASS || code == RID_PUBLIC
-         || code == RID_PROTECTED || code == RID_PRIVATE
-         || code == RID_TRY || code == RID_THROW || code == RID_CATCH);
-}
-
 /* Return true if TYPE is 'id'.  */
 
 static bool
Index: gcc/objc/ChangeLog
===================================================================
--- gcc/objc/ChangeLog  (revision 164528)
+++ gcc/objc/ChangeLog  (working copy)
@@ -1,3 +1,7 @@ 
+2010-09-23  Nicola Pero  <nicola.pero@meta-innovation.com>
+
+       * objc-act.c (objc_is_reserved_word): Removed.
+
 2010-09-21  Nicola Pero  <nicola.pero@meta-innovation.com>
 
        PR objc/23710
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog       (revision 164528)
+++ gcc/ChangeLog       (working copy)
@@ -1,3 +1,9 @@ 
+2010-09-22  Nicola Pero  <nicola.pero@meta-innovation.com>
+
+       * c-parser.c (c_lex_one_token): In Objective-C, when dealing with
+       a CPP_NAME which is a reserved word, clearly separate cases for
+       OBJC_IS_PQ_KEYWORD, OBJC_IS_AT_KEYWORD and OBJC_IS_CXX_KEYWORD.
+
 2010-09-22  Richard Guenther  <rguenther@suse.de>
 
        * tree-inline.c (optimize_inline_calls): Schedule cleanups
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog     (revision 164528)
+++ gcc/testsuite/ChangeLog     (working copy)
@@ -1,3 +1,11 @@ 
+2010-09-22  Nicola Pero  <nicola.pero@meta-innovation.com>
+
+       * objc.dg/keywords-1.m: New test.
+       * objc.dg/keywords-2.m: New test.
+       * objc.dg/keywords-3.m: New test.
+       * obj-c++.dg/keywords-1.mm: New test.
+       * obj-c++.dg/keywords-2.mm: New test.
+
 2010-09-22  Marcus Shawcroft  <marcus.shawcroft@arm.com>
 
        * lib/scanasm.exp(dg-function-on-line): Permit .fnstart to appear in
Index: gcc/testsuite/objc.dg/keywords-1.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-1.m  (revision 0)
+++ gcc/testsuite/objc.dg/keywords-1.m  (revision 0)
@@ -0,0 +1,27 @@ 
+/* Test that 'in', 'out', 'inout', 'bycopy', 'byref', 'oneway'
+   are not keywords outside of a "protocol qualifier" context.
+*/
+/* { dg-do compile } */
+
+typedef int in;
+
+in out (in inout)
+{
+  int byref = inout * 2;
+  
+  return byref + inout;
+}
+
+@class byref;
+
+@interface inout
+@end
+
+@protocol oneway;
+
+int main (void)
+{
+  in bycopy = (in)(out (0));
+
+  return (in)bycopy;
+}
Index: gcc/testsuite/objc.dg/keywords-2.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-2.m  (revision 0)
+++ gcc/testsuite/objc.dg/keywords-2.m  (revision 0)
@@ -0,0 +1,24 @@ 
+/* Test that 'encode', 'end', 'compatibility_alias', 'defs',
+   'protocol', 'selector', 'finally', 'synchronized', 'interface',
+   'implementation' are not keywords if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int encode (int end)
+{
+  int compatibility_alias = end * 2;
+  int defs = compatibility_alias * 2;
+  int protocol = defs * 2;
+  int selector = protocol * 2;
+  int finally = selector * 2;
+  int synchronized = finally * 2;
+  int interface = synchronized * 2;
+  int implementation = interface * 2;
+
+  return implementation;
+}
+
+int main (void)
+{
+  return encode (0);
+}
Index: gcc/testsuite/objc.dg/keywords-3.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-3.m  (revision 0)
+++ gcc/testsuite/objc.dg/keywords-3.m  (revision 0)
@@ -0,0 +1,20 @@ 
+/* Test that 'class', 'public', 'private', 'protected', 'try', 'catch',
+   'throw' are not keywords in pure Objective-C if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int class (int public)
+{
+  int private = public;
+  int protected = private * 2;
+  int try = protected * 2;
+  int catch = try * 2;
+  int throw = catch * 2;
+
+  return throw;
+}
+
+int main (void)
+{
+  return class (0);
+}
Index: gcc/testsuite/obj-c++.dg/keywords-2.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/keywords-2.mm      (revision 0)
+++ gcc/testsuite/obj-c++.dg/keywords-2.mm      (revision 0)
@@ -0,0 +1,24 @@ 
+/* Test that 'encode', 'end', 'compatibility_alias', 'defs',
+   'protocol', 'selector', 'finally', 'synchronized', 'interface',
+   'implementation' are not keywords if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int encode (int end)
+{
+  int compatibility_alias = end * 2;
+  int defs = compatibility_alias * 2;
+  int protocol = defs * 2;
+  int selector = protocol * 2;
+  int finally = selector * 2;
+  int synchronized = finally * 2;
+  int interface = synchronized * 2;
+  int implementation = interface * 2;
+
+  return implementation;
+}
+
+int main (void)
+{
+  return encode (0);
+}
Index: gcc/testsuite/obj-c++.dg/keywords-1.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/keywords-1.mm      (revision 0)
+++ gcc/testsuite/obj-c++.dg/keywords-1.mm      (revision 0)
@@ -0,0 +1,27 @@ 
+/* Test that 'in', 'out', 'inout', 'bycopy', 'byref', 'oneway'
+   are not keywords outside of a "protocol qualifier" context.
+*/
+/* { dg-do compile } */
+
+typedef int in;
+
+in out (in inout)
+{
+  int byref = inout * 2;
+  
+  return byref + inout;
+}
+
+@class byref;
+
+@interface inout
+@end
+
+@protocol oneway;
+
+int main (void)
+{
+  in bycopy = (in)(out (0));
+
+  return (in)bycopy;
+}
Index: gcc/c-parser.c
===================================================================
--- gcc/c-parser.c      (revision 164528)
+++ gcc/c-parser.c      (working copy)
@@ -179,7 +179,11 @@  typedef struct GTY(()) c_parser {
   BOOL_BITFIELD in_if_block : 1;
   /* True if we want to lex an untranslated string.  */
   BOOL_BITFIELD lex_untranslated_string : 1;
+
   /* Objective-C specific parser/lexer information.  */
+
+  /* True if we are in a context where the Objective-C "PQ" keywords
+     are considered keywords.  */
   BOOL_BITFIELD objc_pq_context : 1;
   /* The following flag is needed to contextualize Objective-C lexical
      analysis.  In some cases (e.g., 'int NSObject;'), it is
@@ -236,19 +240,38 @@  c_lex_one_token (c_parser *parser, c_token *token)
                token->keyword = rid_code;
                break;
              }
-           else if (c_dialect_objc ())
+           else if (c_dialect_objc () && OBJC_IS_PQ_KEYWORD (rid_code))
              {
-               if (!objc_is_reserved_word (token->value)
-                   && (!OBJC_IS_PQ_KEYWORD (rid_code)
-                       || parser->objc_pq_context))
+               /* We found an Objective-C "pq" keyword (in, out,
+                  inout, bycopy, byref, oneway).  They need special
+                  care because the interpretation depends on the
+                  context.
+                */
+               if (parser->objc_pq_context)
                  {
-                   /* Return the canonical spelling for this keyword.  */
-                   token->value = ridpointers[(int) rid_code];
                    token->type = CPP_KEYWORD;
                    token->keyword = rid_code;
                    break;
                  }
+               /* Else, "pq" keywords outside of the "pq" context are
+                  not keywords, and we fall through to the code for
+                  normal tokens.
+               */
              }
+           else if (c_dialect_objc () 
+                    && (OBJC_IS_AT_KEYWORD (rid_code)
+                        || OBJC_IS_CXX_KEYWORD (rid_code)))
+             {
+               /* We found one of the Objective-C "@" keywords (defs,
+                  selector, synchronized, etc) or one of the
+                  Objective-C "cxx" keywords (class, private,
+                  protected, public, try, catch, throw) without a
+                  preceding '@' sign.  Do nothing and fall through to
+                  the code for normal tokens (in C++ we would still
+                  consider the CXX ones keywords, but not in C).
+               */
+               ;
+             }
            else
              {
                token->type = CPP_KEYWORD;