From patchwork Mon Nov 22 18:07:50 2010
X-Patchwork-Submitter: Nicola Pero
X-Patchwork-Id: 72569
Date: Mon, 22 Nov 2010 19:07:50 +0100 (CET)
Subject: Re: Fix for PR objc/34033 "compiler accepts invalid string concatenation"
From: "Nicola Pero"
To: "Paolo Bonzini"
Cc: "Mike Stump", "gcc-patches@gnu.org"
Message-ID: <1290449270.824332704@192.168.2.227>

>> Does clang or Apple's gcc accept concat with c strings and objc
>> string?
>
> Assuming Nicola followed their lead, no.  But I was thinking the same
> when looking at the code and testcase.
>
> Actually, I would allow only concat-ing C to ObjC, like @"ab" "cd".  I
> don't think "cd" @"ab" should be allowed.

Excellent comments, thanks to both of you :-)

llvm-gcc does the same as GCC 4.6 used to do, but clang indeed does what
you both suggest; it accepts @"ab" "cd" but rejects "ab" @"cd".

Below, a revised patch which implements this behaviour.  Summarizing:

 * the first string must start with '@' to trigger an Objective-C string
   concat

 * the other strings can have either one '@' or zero '@'; it doesn't
   really matter, as they're all concat-ed together into a final
   CPP_OBJC_STRING

 * there should be no '@' with no following string at the end of the
   concat (eg, NSString *test = @"test"@; is invalid because the last
   '@' makes no sense)

OK to commit?

Thanks

PS: I had not paid enough attention to clang (I was mostly looking at the
llvm-gcc testsuite, which has got lots of testcases; clang doesn't have
as many).
I'll pay more attention in the future.

Index: gcc/c-family/ChangeLog
===================================================================
--- gcc/c-family/ChangeLog (revision 167002)
+++ gcc/c-family/ChangeLog (working copy)
@@ -1,3 +1,9 @@
+2010-11-22  Nicola Pero
+
+        PR objc/34033
+        * c-lex.c (lex_string): Check that each Objective-C string in a
+        string concatenation sequence starts with either one or zero '@'.
+
 2010-11-20  Joseph Myers
 
         * c-pragma.c: Remove conditionals on HANDLE_PRAGMA_PACK,
Index: gcc/c-family/c-lex.c
===================================================================
--- gcc/c-family/c-lex.c (revision 167002)
+++ gcc/c-family/c-lex.c (working copy)
@@ -889,10 +889,12 @@ interpret_fixed (const cpp_token *token,
 
 /* Convert a series of STRING, WSTRING, STRING16, STRING32 and/or
    UTF8STRING tokens into a tree, performing string constant
-   concatenation.  TOK is the first of these.  VALP is the location
-   to write the string into. OBJC_STRING indicates whether an '@' token
-   preceded the incoming token.
-   Returns the CPP token type of the result (CPP_STRING, CPP_WSTRING,
+   concatenation.  TOK is the first of these.  VALP is the location to
+   write the string into.  OBJC_STRING indicates whether an '@' token
+   preceded the incoming token (in that case, the strings can either
+   be ObjC strings, preceded by a single '@', or normal strings, not
+   preceded by '@'.  The result will be a CPP_OBJC_STRING).  Returns
+   the CPP token type of the result (CPP_STRING, CPP_WSTRING,
    CPP_STRING32, CPP_STRING16, CPP_UTF8STRING, or CPP_OBJC_STRING).
 
    This is unfortunately more work than it should be.  If any of the
@@ -918,6 +920,12 @@ lex_string (const cpp_token *tok, tree *
   cpp_string str = tok->val.str;
   cpp_string *strs = &str;
 
+  /* objc_at_sign_was_seen is only used when doing Objective-C string
+     concatenation.  It is 'true' if we have seen an '@' before the
+     current string, and 'false' if not.  We must see exactly one or
+     zero '@' before each string.  */
+  bool objc_at_sign_was_seen = false;
+
  retry:
   tok = cpp_get_token (parse_in);
   switch (tok->type)
@@ -925,9 +933,12 @@ lex_string (const cpp_token *tok, tree *
     case CPP_PADDING:
       goto retry;
     case CPP_ATSIGN:
-      if (c_dialect_objc ())
+      if (objc_string)
         {
-          objc_string = true;
+          if (objc_at_sign_was_seen)
+            error ("repeated %<@%> before Objective-C string");
+
+          objc_at_sign_was_seen = true;
           goto retry;
         }
       /* FALLTHROUGH */
@@ -956,9 +967,15 @@ lex_string (const cpp_token *tok, tree *
 
       concats++;
       obstack_grow (&str_ob, &tok->val.str, sizeof (cpp_string));
+      if (objc_string)
+        objc_at_sign_was_seen = false;
       goto retry;
     }
 
+  /* It is an error if we saw a '@' with no following string.  */
+  if (objc_at_sign_was_seen)
+    error ("stray %<@%> in program");
+
   /* We have read one more token than we want.  */
   _cpp_backup_tokens (parse_in, 1);
   if (concats)
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog (revision 167002)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,11 @@
+2010-11-22  Nicola Pero
+
+        PR objc/34033
+        * objc.dg/strings-1.m: New.
+        * objc.dg/strings-2.m: New.
+        * obj-c++.dg/strings-1.mm: New.
+        * obj-c++.dg/strings-2.mm: New.
+
 2010-11-21  Uros Bizjak
 
         PR target/46533
Index: gcc/testsuite/objc.dg/strings-1.m
===================================================================
--- gcc/testsuite/objc.dg/strings-1.m (revision 0)
+++ gcc/testsuite/objc.dg/strings-1.m (revision 0)
@@ -0,0 +1,33 @@
+/* Contributed by Nicola Pero , November 2010.  */
+/* { dg-do compile } */
+
+#include "../objc-obj-c++-shared/Object1.h"
+#include "../objc-obj-c++-shared/next-mapping.h"
+#ifndef __NEXT_RUNTIME__
+#include
+#endif
+
+/* The following are correct.  */
+id test_valid1 = @"test";
+id test_valid2 = @"te" @"st";
+id test_valid3 = @"te" @"s" @"t";
+id test_valid4 = @ "t" @ "e" @ "s" @ "t";
+
+/* The following are accepted too; you can concat an ObjC string to a
+   C string, the result being an ObjC string.  */
+id test_valid5 = @"te" "st";
+id test_valid6 = @"te" "s" @"t";
+id test_valid7 = @"te" @"s" "t";
+
+/* The following are not correct.  */
+id test_invalid1 = @@"test"; /* { dg-error "stray .@. in program" } */
+const char *test_invalid2 = "test"@; /* { dg-error "stray .@. in program" } */
+const char *test_invalid3 = "test"@@; /* { dg-error "stray .@. in program" } */
+const char *test_invalid4 = "te" @"st"; /* { dg-error "expected" } */
+id test_invalid5 = @"te" @@"st"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid6 = @@"te" @"st"; /* { dg-error "stray .@. in program" } */
+id test_invalid7 = @"te" @"s" @@"t"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid8 = @"te" @@"s" @"t"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid9 = @"te" @"s" @"t" @; /* { dg-error "stray .@. in program" } */
+id test_invalidA = @"te" @ st; /* { dg-error "stray .@. in program" } */
+  /* { dg-error "expected" "" { target *-*-* } 32 } */
Index: gcc/testsuite/objc.dg/strings-2.m
===================================================================
--- gcc/testsuite/objc.dg/strings-2.m (revision 0)
+++ gcc/testsuite/objc.dg/strings-2.m (revision 0)
@@ -0,0 +1,51 @@
+/* Contributed by Nicola Pero , November 2010.  */
+
+/* { dg-do run } */
+/* { dg-xfail-run-if "Needs OBJC2 ABI" { *-*-darwin* && { lp64 && { ! objc2 } } } { "-fnext-runtime" } { "" } } */
+/* { dg-options "-fconstant-string-class=MyTestString" } */
+/* { dg-options "-mno-constant-cfstrings -fconstant-string-class=MyTestString" { target *-*-darwin* } } */
+
+/* { dg-additional-sources "../objc-obj-c++-shared/Object1.m" } */
+
+#include "../objc-obj-c++-shared/Object1.h"
+#include "../objc-obj-c++-shared/next-mapping.h"
+
+#include <stdlib.h> /* For abort() */
+
+@interface MyTestString : Object
+{
+  char *string;
+  unsigned int len;
+}
+/* All strings should contain the C string 'test'.  Call -check to
+   test that this is true.  */
+- (void) check;
+@end
+
+@implementation MyTestString
+- (void) check
+{
+  if (len != 4 || string[0] != 't' || string[1] != 'e'
+      || string[2] != 's' || string[3] != 't' || string[4] != '\0')
+    abort ();
+}
+@end
+
+int main (void)
+{
+  MyTestString *test_valid1 = @"test";
+  MyTestString *test_valid2 = @"te" @"st";
+  MyTestString *test_valid3 = @"te" @"s" @"t";
+  MyTestString *test_valid4 = @ "t" @ "e" @ "s" @ "t";
+  MyTestString *test_valid5 = @ "t" "e" "s" "t";
+  MyTestString *test_valid6 = @ "t" "e" "s" @ "t";
+
+  [test_valid1 check];
+  [test_valid2 check];
+  [test_valid3 check];
+  [test_valid4 check];
+  [test_valid5 check];
+  [test_valid6 check];
+
+  return 0;
+}
Index: gcc/testsuite/obj-c++.dg/strings-1.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/strings-1.mm (revision 0)
+++ gcc/testsuite/obj-c++.dg/strings-1.mm (revision 0)
@@ -0,0 +1,33 @@
+/* Contributed by Nicola Pero , November 2010.  */
+/* { dg-do compile } */
+
+#include "../objc-obj-c++-shared/Object1.h"
+#include "../objc-obj-c++-shared/next-mapping.h"
+#ifndef __NEXT_RUNTIME__
+#include
+#endif
+
+/* The following are correct.  */
+id test_valid1 = @"test";
+id test_valid2 = @"te" @"st";
+id test_valid3 = @"te" @"s" @"t";
+id test_valid4 = @ "t" @ "e" @ "s" @ "t";
+
+/* The following are accepted too; you can concat an ObjC string to a
+   C string, the result being an ObjC string.  */
+id test_valid5 = @"te" "st";
+id test_valid6 = @"te" "s" @"t";
+id test_valid7 = @"te" @"s" "t";
+
+/* The following are not correct.  */
+id test_invalid1 = @@"test"; /* { dg-error "stray .@. in program" } */
+const char *test_invalid2 = "test"@; /* { dg-error "stray .@. in program" } */
+const char *test_invalid3 = "test"@@; /* { dg-error "stray .@. in program" } */
+const char *test_invalid4 = "te" @"st"; /* { dg-error "expected" } */
+id test_invalid5 = @"te" @@"st"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid6 = @@"te" @"st"; /* { dg-error "stray .@. in program" } */
+id test_invalid7 = @"te" @"s" @@"t"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid8 = @"te" @@"s" @"t"; /* { dg-error "repeated .@. before Objective-C string" } */
+id test_invalid9 = @"te" @"s" @"t" @; /* { dg-error "stray .@. in program" } */
+id test_invalidA = @"te" @ st; /* { dg-error "stray .@. in program" } */
+  /* { dg-error "expected" "" { target *-*-* } 32 } */
Index: gcc/testsuite/obj-c++.dg/strings-2.mm
===================================================================
--- gcc/testsuite/obj-c++.dg/strings-2.mm (revision 0)
+++ gcc/testsuite/obj-c++.dg/strings-2.mm (revision 0)
@@ -0,0 +1,51 @@
+/* Contributed by Nicola Pero , November 2010.  */
+
+/* { dg-do run } */
+/* { dg-xfail-run-if "Needs OBJC2 ABI" { *-*-darwin* && { lp64 && { ! objc2 } } } { "-fnext-runtime" } { "" } } */
+/* { dg-options "-fconstant-string-class=MyTestString" } */
+/* { dg-options "-mno-constant-cfstrings -fconstant-string-class=MyTestString" { target *-*-darwin* } } */
+
+/* { dg-additional-sources "../objc-obj-c++-shared/Object1.mm" } */
+
+#include "../objc-obj-c++-shared/Object1.h"
+#include "../objc-obj-c++-shared/next-mapping.h"
+
+#include <stdlib.h> /* For abort() */
+
+@interface MyTestString : Object
+{
+  char *string;
+  unsigned int len;
+}
+/* All strings should contain the C string 'test'.  Call -check to
+   test that this is true.  */
+- (void) check;
+@end
+
+@implementation MyTestString
+- (void) check
+{
+  if (len != 4 || string[0] != 't' || string[1] != 'e'
+      || string[2] != 's' || string[3] != 't' || string[4] != '\0')
+    abort ();
+}
+@end
+
+int main (void)
+{
+  MyTestString *test_valid1 = @"test";
+  MyTestString *test_valid2 = @"te" @"st";
+  MyTestString *test_valid3 = @"te" @"s" @"t";
+  MyTestString *test_valid4 = @ "t" @ "e" @ "s" @ "t";
+  MyTestString *test_valid5 = @ "t" "e" "s" "t";
+  MyTestString *test_valid6 = @ "t" "e" "s" @ "t";
+
+  [test_valid1 check];
+  [test_valid2 check];
+  [test_valid3 check];
+  [test_valid4 check];
+  [test_valid5 check];
+  [test_valid6 check];
+
+  return 0;
+}
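
For anyone who wants to play with the '@' bookkeeping outside the
compiler, here is a minimal standalone sketch of the state machine that
the lex_string change implements.  It is not GCC code: the names
tok_kind, concat_result and check_concat are made up for this
illustration, padding tokens are left out, and it stops at the first
problem, whereas the real lexer can emit both diagnostics for one
sequence.  The input is the run of tokens that follows the first string;
objc_string says whether that first string was preceded by '@'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; not GCC's cpp token types.  */
enum tok_kind { TOK_AT, TOK_STRING, TOK_OTHER };

/* Outcomes, named after the two diagnostics added by the patch.  */
enum concat_result
{
  CONCAT_OK,          /* nothing to complain about */
  CONCAT_REPEATED_AT, /* "repeated '@' before Objective-C string" */
  CONCAT_STRAY_AT     /* "stray '@' in program" */
};

/* Scan the N tokens in TOKS that follow the first string of a
   concatenation.  OBJC_STRING is nonzero if the first string was
   preceded by '@'.  Returns the first problem found, if any.  */
enum concat_result
check_concat (const enum tok_kind *toks, size_t n, int objc_string)
{
  int at_seen = 0; /* plays the role of objc_at_sign_was_seen */
  size_t i;

  assert (n == 0 || toks != NULL);

  for (i = 0; i < n; i++)
    {
      if (toks[i] == TOK_STRING)
        {
          /* A string consumes the pending '@', if there was one.  */
          at_seen = 0;
          continue;
        }
      if (toks[i] == TOK_AT && objc_string)
        {
          if (at_seen)
            return CONCAT_REPEATED_AT;
          at_seen = 1;
          continue;
        }
      /* Any other token (or an '@' after a plain C string) ends the
         sequence; in the latter case the '@' is left for the parser.  */
      break;
    }

  /* A pending '@' with no string after it is an error.  */
  return at_seen ? CONCAT_STRAY_AT : CONCAT_OK;
}
```

With this model, the tail of @"te" @@"st" is { TOK_AT, TOK_AT,
TOK_STRING } with objc_string set, which gives CONCAT_REPEATED_AT, and
the tail of @"te" @ st is { TOK_AT, TOK_OTHER }, which gives
CONCAT_STRAY_AT.  For "te" @"st" (objc_string clear) the function
returns CONCAT_OK, because in that case the lexer does not consume the
'@' and the error comes from the parser instead.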