From patchwork Wed Sep 22 23:36:54 2010
X-Patchwork-Submitter: Nicola Pero
X-Patchwork-Id: 65476
Date: Thu, 23 Sep 2010 01:36:54 +0200 (CEST)
Subject: ObjC parser/lexer tidyup patch
From: "Nicola Pero"
To: gcc-patches@gcc.gnu.org
Message-ID: <1285198614.304312392@192.168.2.228>

This patch refactors some of the lex/parser code of Objective-C to
make it easier to understand, change and update.

In particular, it removes the objc_is_reserved_word() ObjC function,
which bundled two completely different cases (OBJC_IS_AT_KEYWORD and
OBJC_IS_CXX_KEYWORD, but confusingly *not* OBJC_IS_PQ_KEYWORD)
together, making them difficult to distinguish.  Instead, it adds a
new OBJC_IS_CXX_KEYWORD macro and spells out the various cases
explicitly in c-parser.c (c_lex_one_token) and in c-lex.c.  I also
added comments explaining the various cases and issues.

Everything works exactly as before, and there are no regressions.  I
also added some more testcases to ObjC/ObjC++ to make sure I wasn't
breaking anything (and won't be breaking anything in subsequent
patches).

OK to apply?

Thanks

PS: This patch depends on Mike approving the previous "ObjC patch - do
not replace token->value with canonical spelling".

Index: gcc/c-family/ChangeLog
===================================================================
--- gcc/c-family/ChangeLog (revision 164528)
+++ gcc/c-family/ChangeLog (working copy)
@@ -1,3 +1,12 @@
+2010-09-22  Nicola Pero
+
+        * c-common.h (OBJC_IS_CXX_KEYWORD): New macro.  Updated comments.
+        (objc_is_reserved_word): Removed.
+        * c-common.c: Updated comments.
+        * c-lex.c (c_lex_with_flags): Use OBJC_IS_CXX_KEYWORD instead of
+        objc_is_reserved_word.
+        * stub-objc.c (objc_is_reserved_word): Removed.
+
 2010-09-21  Nicola Pero
 
         PR objc/23710
Index: gcc/c-family/c-lex.c
===================================================================
--- gcc/c-family/c-lex.c (revision 164528)
+++ gcc/c-family/c-lex.c (working copy)
@@ -366,7 +366,8 @@ c_lex_with_flags (tree *value, location_t *loc, un
             case CPP_NAME:
               *value = HT_IDENT_TO_GCC_IDENT (HT_NODE (tok->val.node.node));
-              if (objc_is_reserved_word (*value))
+              if (OBJC_IS_AT_KEYWORD (C_RID_CODE (*value))
+                  || OBJC_IS_CXX_KEYWORD (C_RID_CODE (*value)))
                 {
                   type = CPP_AT_NAME;
                   break;
Index: gcc/c-family/c-common.c
===================================================================
--- gcc/c-family/c-common.c (revision 164528)
+++ gcc/c-family/c-common.c (working copy)
@@ -379,8 +379,13 @@ static int resort_field_decl_cmp (const void *, co
    If -fno-asm is used, D_ASM is added to the mask.  If
    -fno-gnu-keywords is used, D_EXT is added.  If -fno-asm and C in
    C89 mode, D_EXT89 is added for both -fno-asm and -fno-gnu-keywords.
-   In C with -Wc++-compat, we warn if D_CXXWARN is set.  */
+   In C with -Wc++-compat, we warn if D_CXXWARN is set.
+   Note the complication of the D_CXX_OBJC keywords.  These are
+   reserved words such as 'class'.  In C++, 'class' is a reserved
+   word.  In Objective-C++ it is too.  In Objective-C, it is a
+   reserved word too, but only if it follows an '@' sign.
+*/
 
 const struct c_common_resword c_common_reswords[] =
 {
   { "_Bool", RID_BOOL, D_CONLY },
Index: gcc/c-family/c-common.h
===================================================================
--- gcc/c-family/c-common.h (revision 164528)
+++ gcc/c-family/c-common.h (working copy)
@@ -76,7 +76,8 @@ enum rid
   /* C++ */
   RID_FRIEND, RID_VIRTUAL, RID_EXPLICIT, RID_EXPORT, RID_MUTABLE,
 
-  /* ObjC */
+  /* ObjC ("PQ" reserved words - they do not appear after a '@' and
+     are keywords only in specific contexts)  */
   RID_IN, RID_OUT, RID_INOUT, RID_BYCOPY, RID_BYREF, RID_ONEWAY,
 
   /* C (reserved and imaginary types not implemented, so any use is a
@@ -105,7 +106,8 @@ enum rid
   /* Too many ways of getting the name of a function as a string */
   RID_FUNCTION_NAME, RID_PRETTY_FUNCTION_NAME, RID_C99_FUNCTION_NAME,
 
-  /* C++ */
+  /* C++ (some of these are keywords in Objective-C as well, but only
+     if they appear after a '@') */
   RID_BOOL, RID_WCHAR, RID_CLASS,
   RID_PUBLIC, RID_PRIVATE, RID_PROTECTED,
   RID_TEMPLATE, RID_NULL, RID_CATCH,
@@ -133,7 +135,8 @@ enum rid
   /* C++0x */
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
-  /* Objective-C */
+  /* Objective-C ("AT" reserved words - they are only keywords when
+     they follow '@')  */
   RID_AT_ENCODE, RID_AT_END,
   RID_AT_CLASS, RID_AT_ALIAS, RID_AT_DEFS,
   RID_AT_PRIVATE, RID_AT_PROTECTED, RID_AT_PUBLIC,
@@ -188,6 +191,18 @@ enum rid
   ((unsigned int) (rid) >= (unsigned int) RID_FIRST_PQ && \
    (unsigned int) (rid) <= (unsigned int) RID_LAST_PQ)
 
+/* OBJC_IS_CXX_KEYWORD recognizes the 'CXX_OBJC' keywords (such as
+   'class') which are shared in a subtle way between Objective-C and
+   C++.  When the lexer is lexing in Objective-C/Objective-C++, if it
+   finds '@' followed by one of these identifiers (eg, '@class'), it
+   recognizes the whole as an Objective-C keyword.  If the identifier
+   is found elsewhere, it follows the rules of the C/C++ language.
+ */
+#define OBJC_IS_CXX_KEYWORD(rid) \
+  (rid == RID_CLASS \
+   || rid == RID_PUBLIC || rid == RID_PROTECTED || rid == RID_PRIVATE \
+   || rid == RID_TRY || rid == RID_THROW || rid == RID_CATCH)
+
 /* The elements of `ridpointers' are identifier nodes for the reserved
    type names and storage classes.  It is indexed by a RID_... value.  */
 extern GTY ((length ("(int) RID_MAX"))) tree *ridpointers;
@@ -940,7 +955,6 @@ extern void c_parse_error (const char *, enum cpp_
 extern tree objc_is_class_name (tree);
 extern tree objc_is_object_ptr (tree);
 extern void objc_check_decl (tree);
-extern int objc_is_reserved_word (tree);
 extern bool objc_compare_types (tree, tree, int, tree);
 extern void objc_volatilize_decl (tree);
 extern bool objc_type_quals_match (tree, tree);
Index: gcc/c-family/stub-objc.c
===================================================================
--- gcc/c-family/stub-objc.c (revision 164528)
+++ gcc/c-family/stub-objc.c (working copy)
@@ -56,12 +56,6 @@ objc_check_decl (tree ARG_UNUSED (decl))
 {
 }
 
-int
-objc_is_reserved_word (tree ARG_UNUSED (ident))
-{
-  return 0;
-}
-
 bool
 objc_compare_types (tree ARG_UNUSED (ltyp), tree ARG_UNUSED (rtyp),
                     int ARG_UNUSED (argno), tree ARG_UNUSED (callee))
Index: gcc/objc/objc-act.c
===================================================================
--- gcc/objc/objc-act.c (revision 164528)
+++ gcc/objc/objc-act.c (working copy)
@@ -824,20 +824,6 @@ objc_add_instance_variable (tree decl)
                                        decl);
 }
 
-/* Return 1 if IDENT is an ObjC/ObjC++ reserved keyword in the context of
-   an '@'.  */
-
-int
-objc_is_reserved_word (tree ident)
-{
-  unsigned char code = C_RID_CODE (ident);
-
-  return (OBJC_IS_AT_KEYWORD (code)
-          || code == RID_CLASS || code == RID_PUBLIC
-          || code == RID_PROTECTED || code == RID_PRIVATE
-          || code == RID_TRY || code == RID_THROW || code == RID_CATCH);
-}
-
 /* Return true if TYPE is 'id'.  */
 
 static bool
Index: gcc/objc/ChangeLog
===================================================================
--- gcc/objc/ChangeLog (revision 164528)
+++ gcc/objc/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2010-09-23  Nicola Pero
+
+        * objc-act.c (objc_is_reserved_word): Removed.
+
 2010-09-21  Nicola Pero
 
         PR objc/23710
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 164528)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2010-09-22  Nicola Pero
+
+        * c-parser.c (c_lex_one_token): In Objective-C, when dealing with
+        a CPP_NAME which is a reserved word, clearly separate cases for
+        OBJC_IS_PQ_KEYWORD, OBJC_IS_AT_KEYWORD and OBJC_IS_CXX_KEYWORD.
+
 2010-09-22  Richard Guenther
 
         * tree-inline.c (optimize_inline_calls): Schedule cleanups
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog (revision 164528)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,11 @@
+2010-09-22  Nicola Pero
+
+        * objc.dg/keywords-1.m: New test.
+        * objc.dg/keywords-2.m: New test.
+        * objc.dg/keywords-3.m: New test.
+        * obj-c++.dg/keywords-1.mm: New test.
+        * obj-c++.dg/keywords-2.mm: New test.
+
 2010-09-22  Marcus Shawcroft
 
         * lib/scanasm.exp(dg-function-on-line): Permit .fnstart to appear in
Index: gcc/testsuite/objc.dg/keywords-1.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-1.m (revision 0)
+++ gcc/testsuite/objc.dg/keywords-1.m (revision 0)
@@ -0,0 +1,27 @@
+/* Test that 'in', 'out', 'inout', 'bycopy', 'byref', 'oneway'
+   are not keywords outside of a "protocol qualifier" context.
+*/
+/* { dg-do compile } */
+
+typedef int in;
+
+in out (in inout)
+{
+  int byref = inout * 2;
+
+  return byref + inout;
+}
+
+@class byref;
+
+@interface inout
+@end
+
+@protocol oneway;
+
+int main (void)
+{
+  in bycopy = (in)(out (0));
+
+  return (in)bycopy;
+}
Index: gcc/testsuite/objc.dg/keywords-2.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-2.m (revision 0)
+++ gcc/testsuite/objc.dg/keywords-2.m (revision 0)
@@ -0,0 +1,24 @@
+/* Test that 'encode', 'end', 'compatibility_alias', 'defs',
+   'protocol', 'selector', 'finally', 'synchronized', 'interface',
+   'implementation' are not keywords if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int encode (int end)
+{
+  int compatibility_alias = end * 2;
+  int defs = compatibility_alias * 2;
+  int protocol = defs * 2;
+  int selector = protocol * 2;
+  int finally = selector * 2;
+  int synchronized = finally * 2;
+  int interface = synchronized * 2;
+  int implementation = interface * 2;
+
+  return implementation;
+}
+
+int main (void)
+{
+  return encode (0);
+}
Index: gcc/testsuite/objc.dg/keywords-3.m
===================================================================
--- gcc/testsuite/objc.dg/keywords-3.m (revision 0)
+++ gcc/testsuite/objc.dg/keywords-3.m (revision 0)
@@ -0,0 +1,20 @@
+/* Test that 'class', 'public', 'private', 'protected', 'try', 'catch',
+   'throw' are not keywords in pure Objective-C if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int class (int public)
+{
+  int private = public;
+  int protected = private * 2;
+  int try = protected * 2;
+  int catch = try * 2;
+  int throw = catch * 2;
+
+  return throw;
+}
+
+int main (void)
+{
+  return class (0);
+}
Index: gcc/testsuite/obj-c++.dg/keywords-2.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/keywords-2.mm (revision 0)
+++ gcc/testsuite/obj-c++.dg/keywords-2.mm (revision 0)
@@ -0,0 +1,24 @@
+/* Test that 'encode', 'end', 'compatibility_alias', 'defs',
+   'protocol', 'selector', 'finally', 'synchronized', 'interface',
+   'implementation' are not keywords if not after a '@'.
+*/
+/* { dg-do compile } */
+
+int encode (int end)
+{
+  int compatibility_alias = end * 2;
+  int defs = compatibility_alias * 2;
+  int protocol = defs * 2;
+  int selector = protocol * 2;
+  int finally = selector * 2;
+  int synchronized = finally * 2;
+  int interface = synchronized * 2;
+  int implementation = interface * 2;
+
+  return implementation;
+}
+
+int main (void)
+{
+  return encode (0);
+}
Index: gcc/testsuite/obj-c++.dg/keywords-1.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/keywords-1.mm (revision 0)
+++ gcc/testsuite/obj-c++.dg/keywords-1.mm (revision 0)
@@ -0,0 +1,27 @@
+/* Test that 'in', 'out', 'inout', 'bycopy', 'byref', 'oneway'
+   are not keywords outside of a "protocol qualifier" context.
+*/
+/* { dg-do compile } */
+
+typedef int in;
+
+in out (in inout)
+{
+  int byref = inout * 2;
+
+  return byref + inout;
+}
+
+@class byref;
+
+@interface inout
+@end
+
+@protocol oneway;
+
+int main (void)
+{
+  in bycopy = (in)(out (0));
+
+  return (in)bycopy;
+}
Index: gcc/c-parser.c
===================================================================
--- gcc/c-parser.c (revision 164528)
+++ gcc/c-parser.c (working copy)
@@ -179,7 +179,11 @@ typedef struct GTY(()) c_parser {
   BOOL_BITFIELD in_if_block : 1;
   /* True if we want to lex an untranslated string.  */
   BOOL_BITFIELD lex_untranslated_string : 1;
+
   /* Objective-C specific parser/lexer information.  */
+
+  /* True if we are in a context where the Objective-C "PQ" keywords
+     are considered keywords.  */
   BOOL_BITFIELD objc_pq_context : 1;
   /* The following flag is needed to contextualize Objective-C lexical
      analysis.  In some cases (e.g., 'int NSObject;'), it is
@@ -236,19 +240,38 @@ c_lex_one_token (c_parser *parser, c_token *token)
                 token->keyword = rid_code;
                 break;
               }
-            else if (c_dialect_objc ())
+            else if (c_dialect_objc () && OBJC_IS_PQ_KEYWORD (rid_code))
               {
-                if (!objc_is_reserved_word (token->value)
-                    && (!OBJC_IS_PQ_KEYWORD (rid_code)
-                        || parser->objc_pq_context))
+                /* We found an Objective-C "pq" keyword (in, out,
+                   inout, bycopy, byref, oneway).  They need special
+                   care because the interpretation depends on the
+                   context.
+                 */
+                if (parser->objc_pq_context)
                   {
-                    /* Return the canonical spelling for this keyword.  */
-                    token->value = ridpointers[(int) rid_code];
                     token->type = CPP_KEYWORD;
                     token->keyword = rid_code;
                     break;
                   }
+                /* Else, "pq" keywords outside of the "pq" context are
+                   not keywords, and we fall through to the code for
+                   normal tokens.
+                 */
               }
+            else if (c_dialect_objc ()
+                     && (OBJC_IS_AT_KEYWORD (rid_code)
+                         || OBJC_IS_CXX_KEYWORD (rid_code)))
+              {
+                /* We found one of the Objective-C "@" keywords (defs,
+                   selector, synchronized, etc) or one of the
+                   Objective-C "cxx" keywords (class, private,
+                   protected, public, try, catch, throw) without a
+                   preceding '@' sign.  Do nothing and fall through to
+                   the code for normal tokens (in C++ we would still
+                   consider the CXX ones keywords, but not in C).
+                 */
+                ;
+              }
             else
               {
                 token->type = CPP_KEYWORD;
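
For readers following the c_lex_one_token hunk, the decision it now spells out for a reserved word seen without a preceding '@' in Objective-C can be modelled roughly as below.  This is only an illustrative sketch, not GCC code: enum kw_group, objc_bare_word_is_keyword() and check_keyword_model() are hypothetical names, the groups stand in for OBJC_IS_PQ_KEYWORD, OBJC_IS_AT_KEYWORD and OBJC_IS_CXX_KEYWORD, and pq_context stands in for parser->objc_pq_context.  It covers the C front end only; as the patch comment notes, C++ would still treat the "cxx" words as keywords.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the groups of reserved words named in the
   patch; these are NOT the real 'enum rid' values from c-common.h.  */
enum kw_group
{
  KW_OTHER, /* any other reserved word, e.g. 'int' */
  KW_PQ,    /* in, out, inout, bycopy, byref, oneway */
  KW_AT,    /* encode, end, selector, ...: keywords only after '@' */
  KW_CXX    /* class, public, private, protected, try, catch, throw */
};

/* Simplified model of the Objective-C decision for a reserved word
   that is NOT preceded by '@'.  Returns true if the word is lexed as
   a keyword, false if it falls through to the code for normal tokens.
   PQ_CONTEXT plays the role of parser->objc_pq_context.  */
static bool
objc_bare_word_is_keyword (enum kw_group group, bool pq_context)
{
  switch (group)
    {
    case KW_PQ:
      /* "pq" words are keywords only inside a "pq" context.  */
      return pq_context;
    case KW_AT:
    case KW_CXX:
      /* Without '@' these are ordinary identifiers in Objective-C.  */
      return false;
    default:
      return true;
    }
}

/* The expectations exercised by the new keywords-1/2/3 tests.  */
static void
check_keyword_model (void)
{
  assert (!objc_bare_word_is_keyword (KW_PQ, false));  /* 'typedef int in;' */
  assert (!objc_bare_word_is_keyword (KW_AT, false));  /* 'int selector' */
  assert (!objc_bare_word_is_keyword (KW_CXX, false)); /* 'int class (...)' */
  assert (objc_bare_word_is_keyword (KW_PQ, true));    /* inside a pq context */
}
```

Keeping the three groups as separate cases mirrors the point of the patch: only the "pq" group depends on parser state, while the "@" and "cxx" groups never become keywords on their own in Objective-C.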