From patchwork Fri Sep 5 17:45:06 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?TWFudWVsIEzDs3Blei1JYsOhw7Fleg==?=
X-Patchwork-Id: 386480
From: Manuel López-Ibáñez
Date: Fri, 5 Sep 2014 19:45:06 +0200
Subject: Encode Wnormalized= is c.opt
To: Gcc Patch List, "Joseph S. Myers"

This patch moves handling of Wnormalized= to c.opt. There were two
quirks when doing this:

1) I cannot use the cpplib.h type 'enum cpp_normalize_level' as Type(),
because this would require including cpplib.h in options.h, which in
turn causes a lot of problems. Thus I needed to use Type(int).
Similarly, I cannot use this type for the CPP option warn_normalize,
because C++ does not allow implicit conversions from int to this enum.

2) The code in c_common_handle_option seems to say that -Wnormalized=
is equivalent to -Wnormalized=nfkc. However, -Wnormalized= was already
rejected as not valid. Moreover, the code emits a note for
-Werror=normalized= saying that it is equivalent to -Wnormalized=nfc,
but this note is never actually emitted, since the code never reaches
that condition. So I chose not to replicate the note or to allow
-Wnormalized=. Surprisingly, -Werror=normalized= was already equivalent
to what -Werror=normalized would do, but -Werror=normalized was
rejected because -Wnormalized did not exist. Thus, I added -Wnormalized
to handle this corner case.

In summary, after the patch the only behavior changes are that
-Werror=normalized, -Wnormalized and -Wno-normalized work.

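As an illustration only (this is not code from the patch), the mapping
from command-line spelling to warning level described above can be
sketched as follows. level_for is a hypothetical helper, the enumerator
order is assumed to match enum cpp_normalize_level in cpplib.h, and the
exact, case-sensitive string match is an assumption of the sketch
rather than a statement about the option machinery.

```cpp
#include <string>

// Assumed to mirror enum cpp_normalize_level in libcpp/include/cpplib.h.
enum cpp_normalize_level {
  normalized_KC = 0,
  normalized_C,
  normalized_identifier_C,
  normalized_none
};

// Hypothetical helper: returns the level selected by one command-line
// spelling, or -1 if the spelling is rejected.
int level_for(const std::string &opt) {
  // Alias(Wnormalized=,nfc,none): positive and negative forms.
  if (opt == "-Wnormalized")
    return normalized_C;
  if (opt == "-Wno-normalized")
    return normalized_none;

  // RejectNegative: only the positive joined form takes an argument.
  const std::string prefix = "-Wnormalized=";
  if (opt.compare(0, prefix.size(), prefix) != 0)
    return -1;

  // One branch per EnumValue record in c.opt.
  const std::string arg = opt.substr(prefix.size());
  if (arg == "none") return normalized_none;
  if (arg == "nfkc") return normalized_KC;
  if (arg == "id")   return normalized_identifier_C;
  if (arg == "nfc")  return normalized_C;
  return -1;  // unknown or empty argument, e.g. a bare -Wnormalized=
}
```

In this sketch level_for("-Wnormalized") and
level_for("-Wnormalized=nfc") both give normalized_C,
level_for("-Wno-normalized") gives normalized_none, and a bare
"-Wnormalized=" is still rejected.
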
Bootstrapped and regression tested on x86_64-linux-gnu

OK?

gcc/ChangeLog:

2014-09-05  Manuel López-Ibáñez

	* doc/invoke.texi (Wnormalized=): Update.

libcpp/ChangeLog:

2014-09-05  Manuel López-Ibáñez

	* include/cpplib.h (struct cpp_options): Declare warn_normalize
	as int instead of enum.

gcc/c-family/ChangeLog:

2014-09-05  Manuel López-Ibáñez

	* c.opt (Wnormalized): New.
	(Wnormalized=): Use Enum and Reject Negative.
	* c-opts.c (c_common_handle_option): Do not handle Wnormalized here.

gcc/testsuite/ChangeLog:

2014-09-05  Manuel López-Ibáñez

	* gcc.dg/cpp/warn-normalized-3.c: Delete useless dg-prune-output.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 214904)
+++ gcc/doc/invoke.texi (working copy)
@@ -260,11 +260,12 @@ Objective-C and Objective-C++ Dialects}.
 -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol
 -Winvalid-pch -Wlarger-than=@var{len} -Wunsafe-loop-optimizations @gol
 -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmemset-transposed-args -Wmissing-braces @gol
 -Wmissing-field-initializers -Wmissing-include-dirs @gol
--Wno-multichar -Wnonnull -Wodr -Wno-overflow -Wopenmp-simd @gol
+-Wno-multichar -Wnonnull -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
+ -Wodr -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings -Wpacked -Wpacked-bitfield-compat -Wpadded @gol
 -Wparentheses -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith -Wno-pointer-to-int-cast @gol
 -Wredundant-decls -Wno-return-local-addr @gol
 -Wreturn-type -Wsequence-point -Wshadow -Wno-shadow-ivar @gol
@@ -4918,12 +4919,14 @@ warnings without this one, use @option{-
 @opindex Wmultichar
 Do not warn if a multicharacter constant (@samp{'FOOF'}) is used.
 Usually they indicate a typo in the user's code, as they have
 implementation-defined values, and should not be used in portable code.
 
-@item -Wnormalized=
+@item -Wnormalized@r{[}=@r{<}none@r{|}id@r{|}nfc@r{|}nfkc@r{>]}
 @opindex Wnormalized=
+@opindex Wnormalized
+@opindex Wno-normalized
 @cindex NFC
 @cindex NFKC
 @cindex character set, input normalization
 In ISO C and ISO C++, two identifiers are different if they are
 different sequences of characters. However, sometimes when characters
@@ -4935,24 +4938,26 @@ the same sequence. GCC can warn you if
 have not been normalized; this option controls that warning.
 
 There are four levels of warning supported by GCC@. The default is
 @option{-Wnormalized=nfc}, which warns about any identifier that is
 not in the ISO 10646 ``C'' normalized form, @dfn{NFC}. NFC is the
-recommended form for most uses.
+recommended form for most uses. It is equivalent to
+@option{-Wnormalized}.
 
 Unfortunately, there are some characters allowed in identifiers by
 ISO C and ISO C++ that, when turned into NFC, are not allowed in
 identifiers. That is, there's no way to use these symbols in portable
 ISO C or C++ and have all your identifiers in NFC@.
 @option{-Wnormalized=id} suppresses the warning for these characters.
 It is hoped that future versions of the standards involved will correct
 this, which is why this option is not the default.
 
 You can switch the warning off for all characters by writing
-@option{-Wnormalized=none}. You should only do this if you
-are using some other normalization scheme (like ``D''), because
-otherwise you can easily create bugs that are literally impossible to see.
+@option{-Wnormalized=none} or @option{-Wno-normalized}. You should
+only do this if you are using some other normalization scheme (like
+``D''), because otherwise you can easily create bugs that are
+literally impossible to see.
 
 Some characters in ISO 10646 have distinct meanings but look identical
 in some fonts or display methodologies, especially once formatting has
 been applied. For instance @code{\u207F}, ``SUPERSCRIPT LATIN SMALL
 LETTER N'', displays just like a regular @code{n} that has been
Index: gcc/c-family/c.opt
===================================================================
--- gcc/c-family/c.opt (revision 214904)
+++ gcc/c-family/c.opt (working copy)
@@ -631,13 +639,36 @@ Warn about NULL being passed to argument
 
 Wnonnull
 C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
 ;
 
+Wnormalized
+C ObjC C++ ObjC++ Alias(Wnormalized=,nfc,none)
+;
+
 Wnormalized=
-C ObjC C++ ObjC++ Joined Warning
--Wnormalized= Warn about non-normalised Unicode strings
+C ObjC C++ ObjC++ RejectNegative Joined Warning CPP(warn_normalize) Init(normalized_C) Var(cpp_warn_normalize) Enum(cpp_normalize_level)
+-Wnormalized= Warn about non-normalised Unicode strings
+
+; Required for these enum values.
+SourceInclude
+cpplib.h
+
+Enum
+Name(cpp_normalize_level) Type(int) UnknownError(argument %qs to %<-Wnormalized%> not recognized)
+
+EnumValue
+Enum(cpp_normalize_level) String(none) Value(normalized_none)
+
+EnumValue
+Enum(cpp_normalize_level) String(nfkc) Value(normalized_KC)
+
+EnumValue
+Enum(cpp_normalize_level) String(id) Value(normalized_identifier_C)
+
+EnumValue
+Enum(cpp_normalize_level) String(nfc) Value(normalized_C)
 
 Wold-style-cast
 C++ ObjC++ Var(warn_old_style_cast) Warning
 Warn if a C-style cast is used in a program
 
Index: gcc/c-family/c-opts.c
===================================================================
--- gcc/c-family/c-opts.c (revision 214904)
+++ gcc/c-family/c-opts.c (working copy)
@@ -382,33 +382,10 @@ c_common_handle_option (size_t scode, co
       /* ??? Don't add new options here. Use LangEnabledBy in c.opt. */
 
       cpp_opts->warn_num_sign_change = value;
       break;
 
-    case OPT_Wnormalized_:
-      /* FIXME: Move all this to c.opt. */
-      if (kind == DK_ERROR)
-        {
-          gcc_assert (!arg);
-          inform (input_location, "-Werror=normalized=: set -Wnormalized=nfc");
-          cpp_opts->warn_normalize = normalized_C;
-        }
-      else
-        {
-          if (!value || (arg && strcasecmp (arg, "none") == 0))
-            cpp_opts->warn_normalize = normalized_none;
-          else if (!arg || strcasecmp (arg, "nfkc") == 0)
-            cpp_opts->warn_normalize = normalized_KC;
-          else if (strcasecmp (arg, "id") == 0)
-            cpp_opts->warn_normalize = normalized_identifier_C;
-          else if (strcasecmp (arg, "nfc") == 0)
-            cpp_opts->warn_normalize = normalized_C;
-          else
-            error ("argument %qs to %<-Wnormalized%> not recognized", arg);
-          break;
-        }
-
     case OPT_Wunknown_pragmas:
       /* Set to greater than 1, so that even unknown pragmas in
          system headers will be warned about. */
       /* ??? There is no way to handle this automatically for now. */
       warn_unknown_pragmas = value * 2;
Index: gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c
===================================================================
--- gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c (revision 214904)
+++ gcc/testsuite/gcc.dg/cpp/warn-normalized-3.c (working copy)
@@ -1,5 +1,4 @@
 // { dg-do preprocess }
 // { dg-options "-std=gnu99 -fdiagnostics-show-option -fextended-identifiers -Werror=normalized=" }
 /* { dg-message "some warnings being treated as errors" "" {target "*-*-*"} 0 } */
- // { dg-prune-output ".*-Werror=normalized=: set -Wnormalized=nfc.*" }
 \u0F43 // { dg-error "`.U00000f43' is not in NFC .-Werror=normalized=." }
Index: libcpp/include/cpplib.h
===================================================================
--- libcpp/include/cpplib.h (revision 214904)
+++ libcpp/include/cpplib.h (working copy)
@@ -455,12 +455,12 @@ struct cpp_options
 
   /* Holds the name of the input character set. */
   const char *input_charset;
 
   /* The minimum permitted level of normalization before a warning
-     is generated. */
-  enum cpp_normalize_level warn_normalize;
+     is generated. See enum cpp_normalize_level. */
+  int warn_normalize;
 
   /* True to warn about precompiled header files we couldn't use. */
   bool warn_invalid_pch;
 
   /* True if dependencies should be restored from a precompiled header. */
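
To illustrate quirk 1 (again as a sketch, not code from the patch):
C++ converts an unscoped enum to int implicitly but not the other way
round, so a field that is filled from an int value is simplest to
declare as int. cpp_options_sketch, set_warn_normalize and has_level
are hypothetical names, and the enumerator order is assumed to match
cpplib.h.

```cpp
#include <type_traits>

// Assumed to mirror enum cpp_normalize_level in libcpp/include/cpplib.h.
enum cpp_normalize_level {
  normalized_KC = 0,
  normalized_C,
  normalized_identifier_C,
  normalized_none
};

// enum -> int is implicit in C++; int -> enum requires an explicit cast.
static_assert(std::is_convertible<cpp_normalize_level, int>::value,
              "enum to int converts implicitly");
static_assert(!std::is_convertible<int, cpp_normalize_level>::value,
              "int to enum does not convert implicitly");

// Hypothetical stand-in for the relevant part of struct cpp_options.
struct cpp_options_sketch {
  int warn_normalize;  // See enum cpp_normalize_level.
};

// Stands in for option-handling code that only has an int in hand.
inline void set_warn_normalize(cpp_options_sketch &opts, int value) {
  opts.warn_normalize = value;  // no cast needed because the field is int
}

// Consumers can still compare the field against the enumerators.
inline bool has_level(const cpp_options_sketch &opts,
                      cpp_normalize_level level) {
  return opts.warn_normalize == level;
}
```

Had the field kept its enum type, the assignment in set_warn_normalize
would need a static_cast from int.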