From patchwork Mon Sep 26 14:21:13 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dodji Seketeli
X-Patchwork-Id: 116428
From: Dodji Seketeli
To: Jason Merrill
Cc: gcc-patches@gcc.gnu.org, tromey@redhat.com, gdr@integrable-solutions.net, joseph@codesourcery.com, burnus@net-b.de, charlet@act-europe.fr, bonzini@gnu.org
Subject: Re: [PATCH 2/7] Generate virtual locations for tokens
References: <1291979498-1604-1-git-send-email-dodji@redhat.com> <4E6E65B2.2060909@redhat.com> <4E711F42.80802@redhat.com> <4E77ACFE.2080805@redhat.com> <4E7B673D.9050306@redhat.com>
X-URL: http://www.redhat.com
Date: Mon, 26 Sep 2011 16:21:13 +0200
In-Reply-To: <4E7B673D.9050306@redhat.com> (Jason Merrill's message of "Thu, 22 Sep 2011 12:50:05 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

> > abort () added in that case.
>
> Please update the comment as well.

Done.

>
> > +/* An iterator over tokens coming from a function line macro
>
> "function-like"

Fixed.

>
> > +  /* The function-like macro the tokens come from.  */
> > +  const macro_arg *arg;
>
> This field doesn't seem to be used anywhere.

Removed.

>
> > +  /* The cpp_reader the macro comes from.  */
> > +  cpp_reader *pfile;
>
> This seems to only be used to decide whether or not to increment
> location_ptr.  Rather than base that decision on going all the way
> back to the CPP_OPTION, let's just pass a flag to
> macro_arg_token_iter_init and use that to decide whether or not to set
> location_ptr.

Conceptually, location_ptr is always set, barring one exception: when
the token pointed to by the iterator is empty.  This can happen when
an empty token is passed to a macro.  E.g.:

/
| #define M(var...) var
| M();
\

In all the other cases, location_ptr is set to the virtual location of
the token.  In the particular case where we are not tracking macro
expansions (thus virtual location == spelling location), location_ptr
is set to &token->src_loc.  It's the incrementing of location_ptr that
is indeed done only when we are tracking macro expansions.

In any case, I agree that storing pfile in the iterator is too heavy.
I have added a track_macro_exp_p flag to struct macro_arg_token_iter
and removed the pfile member.  I have updated the functions that were
expecting a pfile in macro_arg_token_iter.
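
For readers following the thread, here is a minimal, self-contained sketch of an iterator that carries the tracking flag itself instead of a pointer back to the reader. It is not the code from the patch: the `token` and `arg_token_iter` types and the `arg_token_iter_*` helpers are hypothetical stand-ins, and reading the location straight from the token when tracking is off is a simplification chosen for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libcpp types; not the real definitions.  */
typedef unsigned int source_location;
typedef struct { source_location src_loc; } token;

typedef struct
{
  bool track_macro_exp_p;               /* Stored flag; no pfile needed.  */
  const token **token_ptr;              /* Current token.  */
  const source_location *location_ptr;  /* Current virtual location, used
                                           only when tracking is on.  */
} arg_token_iter;

/* Point IT at TOKENS.  VIRT_LOCS is consulted only when TRACK is true.  */
static void
arg_token_iter_init (arg_token_iter *it, bool track,
                     const token **tokens, const source_location *virt_locs)
{
  it->track_macro_exp_p = track;
  it->token_ptr = tokens;
  it->location_ptr = track ? virt_locs : NULL;
}

/* Return the location of the current token: the separately stored
   virtual location when tracking, the token's own location otherwise
   (an assumption of this sketch).  */
static source_location
arg_token_iter_get_location (const arg_token_iter *it)
{
  if (it->track_macro_exp_p)
    return *it->location_ptr;
  return (*it->token_ptr)->src_loc;
}

/* Advance to the next token; the location pointer moves only when
   virtual locations are being tracked.  */
static void
arg_token_iter_forward (arg_token_iter *it)
{
  it->token_ptr++;
  if (it->track_macro_exp_p)
    it->location_ptr++;
}
```

The point of the design is that every decision the iterator makes depends only on state it already holds, so callers evaluate the option once at initialization time.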
> > +/* Return the location of the token pointed to by the iterator.*/
> > +static source_location
> > +macro_arg_token_iter_get_location (const macro_arg_token_iter *it)
> > +{
> > +#ifdef ENABLE_CHECKING
> > +  if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED
> > +      && it->num_forwards > 0)
> > +    abort ();
> > +#endif
> > +  return *it->location_ptr;
> > +}
>
> And then here if location_ptr isn't set we should get the location
> from the token.

This is not needed here, as it->location_ptr is always set, as I
explained above.

>
> > +  if (virt_location)
> > +    {
> > +      if (track_macro_exp_p)
> > +        {
> > +          if (kind == MACRO_ARG_TOKEN_NORMAL)
> > +            *virt_location = &arg->virt_locs[index];
> > +          else if (kind == MACRO_ARG_TOKEN_EXPANDED)
> > +            *virt_location = &arg->expanded_virt_locs[index];
> > +          else if (kind == MACRO_ARG_TOKEN_STRINGIFIED)
> > +            *virt_location =
> > +              (source_location *) &tokens_ptr[index]->src_loc;
> > +        }
> > +      else
>
> Similarly, here virt_location should only be set when we're tracking
> macro expansions, so the second test becomes redundant.

Fixed.  I have updated get_arg_token_location to get the location from
the token when we are not tracking macro expansions and adjusted
set_arg_token to avoid updating the separate virtual location if we
are not tracking macro expansions.

>
> > +tokens_buff_new (cpp_reader *pfile, size_t len,
> > +                 source_location **virt_locs)
> > +{
> > +  bool track_macro_exp_p = CPP_OPTION (pfile, track_macro_expansion);
> > +  size_t tokens_size = len * sizeof (cpp_token *);
> > +  size_t locs_size = len * sizeof (source_location);
> > +
> > +  if (track_macro_exp_p && virt_locs != NULL)
> > +    *virt_locs = XNEWVEC (source_location, locs_size);
>
> And here.

Fixed.

>
> > +  *num_args = num_args_alloced;;
>
> Extra ;
>
> > +      num_args_alloced++;
> >
> >        argc++;
>
> How does num_args_alloced differ from argc?

Because of this:

      /* A single empty argument is counted as no argument.  */
      if (argc == 1 && macro->paramc == 0 && args[0].count == 0)
        argc = 0;

which happens when a variadic function-like macro is passed one empty
argument.  In that case, argc is zero even though one arg has been
allocated.  I need to know the number of args that were allocated by
collect_args so that enter_macro_context can correctly delete those
args (after indirectly collecting them with funlike_invocation_p) by
calling delete_macro_args.

>
> > +  result =
> > +    tokens_buff_put_token_to (pfile, (const cpp_token **) BUFF_FRONT (buffer),
> > +                              &virt_locs[token_index],
> > +                              token, def_loc, parm_def_loc,
> > +                              map, macro_token_index);
>
> Here if virt_locs is null we should pass down null as well.

OK.

>
> > +  if (track_macro_exp_p)
> > +    {
> > +      if (map)
> > +        macro_loc = linemap_add_macro_token (map, macro_token_index,
> > +                                             def_loc, parm_def_loc);
> > +      *virt_loc_dest = macro_loc;
> > +    }
>
> So that here again we can just check that virt_loc_dest is set rather
> than the CPP_OPTION.

OK.

>
> >    pfile->context = context->prev;
> > +  /* decrease peak memory consumption by feeing the context.  */
> > +  pfile->context->next = NULL;
> > +  free (context);
>
> Setting pfile->context->next to NULL seems wrong; either it's already
> NULL or we're making something unreachable.

My understanding is that contexts are allocated with next_context, and
used in a stack-ish way only by _cpp_push_*_context.  And
_cpp_pop_context frees that memory (it was previously recycling it for
reuse by next_context).  next_context tests that context->next is null
(which means there is no context to reuse) and then allocates the
memory.  I think the reason why nothing was setting context->next to
null before is that the contexts were being recycled.  So I fail to
see what would be made unreachable here.

>
> > +   LOC is an out parameter; *LOC is set to the location "as expected
> > +   by the user".  */
>
> This is puzzling without the explanation before
> cpp_get_token_with_location; just refer to that comment here.

Done.

>
> In cpp_get_token_1 the distinction between code paths that set
> virt_loc and those that set *location directly seems unfortunate; I
> would think it would be cleaner to do
>
> > +  if (location)
>      {
>        if (virt_loc == 0) virt_loc = result->src_loc;
> > +    *location = virt_loc;
>

Done.

> and drop the direct settings of *location/gotos earlier in the
> function.

Note that the gotos were put there also because we needed to get out
of the for (;;) loop, similarly to what the previous return statements
were doing; so doing this doesn't get rid of the gotos.

I have bootstrapped and tested this patch on x86_64-unknown-linux-gnu
against recent trunk.

From: Dodji Seketeli
Date: Sat, 4 Dec 2010 14:04:29 +0100
Subject: [PATCH 2/7] Generate virtual locations for tokens

This second instalment uses the infrastructure of the previous patch
to allocate a macro map for each macro expansion and assign a virtual
location to each token resulting from the expansion.

To date, when cpp_get_token comes across a token that happens to be a
macro, the macro expander kicks in, expands the macro, pushes the
resulting tokens onto a "token context" and returns a dummy padding
token.  The next call to cpp_get_token looks into the token context
for the next token [which is going to result from the previous macro
expansion] and returns it.  If the token is a macro, the macro expander
kicks in and you know the story.

This patch piggy-backs on that macro expansion process, so to speak.
First it modifies the macro expander to make it create a macro map for
each macro expansion.  It then allocates a virtual location for each
resulting token.  Virtual locations of tokens resulting from macro
expansions are then stored on a special kind of context called an
"expanded tokens context".
In other words, in an expanded tokens context, there are tokens
resulting from macro expansion and their associated virtual locations.

cpp_get_token_with_location is modified to return the virtual location
of tokens resulting from macro expansion.  Note that once all tokens
from an expanded token context have been consumed and the context is
freed, the memory used to store the virtual locations of the tokens
held in that context is freed as well.  This helps reduce the overall
peak memory consumption.

The client code that was getting macro expansion point location from
cpp_get_token_with_location now gets virtual location from it.  Those
virtual locations can in turn be resolved into the different
interesting physical locations thanks to the linemap API exposed by
the previous patch.

Expensive progress.  Possibly.  So this whole virtual location
allocation business is switched off by default.  So by default no
extended token is created.  No extended token context is created
either.  One has to use -ftrack-macro-expansion to switch this on.
This complicates the code but I believe it can be useful as some of
our friends found out at http://llvm.org/bugs/show_bug.cgi?id=5610

The patch tries to reduce the memory consumption by freeing some token
context memory that was being reused before.  I didn't notice any
compilation slowdown due to this immediate freeing on my GNU/Linux
system.

As no client code tries to resolve virtual locations to anything but
what was being done before, no new test case has been added.

The combination of this patch and the previous one bootstraps with
--enable-languages=all,ada and passes regression tests on
x86_64-unknown-linux-gnu.

gcc/

	* doc/cppopts.texi (-ftrack-macro-expansion): Document new option.
	* doc/invoke.texi (-ftrack-macro-expansion): Add this to the list of
	preprocessor related options.

gcc/c-family/

	* c.opt (ftrack-macro-expansion): New option. Handle it with and
	without argument.
	* c-opts.c (c_common_handle_option): New cases.
Handle -ftrack-macro-expansion with and without argument. libcpp/ * include/cpplib.h (struct cpp_options): New option. * internal.h (struct macro_context): New struct. (enum context_tokens_kind): New enum. (struct cpp_context): New member of type enum context_tokens_kind. (struct cpp_context): Remove this. Replace it with an enum of macro and macro_context. (struct cpp_context): Remove. (_cpp_remaining_tokens_num_in_context): Declare new function. * directives.c (destringize_and_run): Adjust. * lex.c (_cpp_remaining_tokens_num_in_context) (_cpp_token_from_context_at): Define new functions (cpp_peek_token): Use them. * init.c (cpp_create_reader): Initialize the base context to zero. (_cpp_token_from_context_at): Define new static function. (cpp_peek_token): Use new _cpp_remaining_tokens_num_in_context and _cpp_token_from_context_at. * macro.c (struct macro_arg): New members. (enum macro_arg_token_kind): New enum. (struct macro_arg_token_iter): New struct. (maybe_adjust_loc_for_trad_cpp, push_extended_tokens_context) (alloc_expanded_arg_mem, ensure_expanded_arg_room) (delete_macro_args, set_arg_token, get_arg_token_location) (arg_token_ptr_at, macro_arg_token_iter_init) (macro_arg_token_iter_get_token) (macro_arg_token_iter_get_location, macro_arg_token_iter_forward) (expanded_token_index, tokens_buff_new, tokens_buff_count) (tokens_buff_last_token_ptr, tokens_buff_put_token_to) (tokens_buff_add_token, tokens_buff_remove_last_token) (reached_end_of_context, consume_next_token_from_context): New static functions. (cpp_get_token_1): New static function. Split and extended from cpp_get_token. Use reached_end_of_context and consume_next_token_from_context. Unify its return point. Move the location tweaking from cpp_get_token_with_location in here. (cpp_get_token): Use cpp_get_token_1 (stringify_arg): Use the new arg_token_at. (paste_all_tokens): Support tokens coming from extended tokens contexts. (collect_args): Return the number of collected arguments, by parameter. 
Store virtual locations of tokens that constitute the collected args. (funlike_invocation_p): Return the number of collected arguments, by parameter. (enter_macro_context): Add a parameter for macro expansion point. Pass it to replace_args and to the "used" cpp callback. Get the number of function-like macro arguments from funlike_invocation_p, pass it to the new delete_macro_args to free the memory used by macro args. When -ftrack-macro-expansion is in effect, for macros that have no arguments, create a macro map for the macro expansion and use it to allocate proper virtual locations for tokens resulting from the expansion. Push an extended tokens context containing the tokens resulting from macro expansion and their virtual locations. (replace_args): Rename the different variables named 'count' into variables with more meaningful names. Create a macro map; allocate virtual locations of tokens resulting from this expansion. Use macro_arg_token_iter to iterate over tokens of a given macro. Handle the case of the argument of -ftrack-macro-expansion being < 2. Don't free macro arguments memory resulting from expand_arg here, as these are freed by the caller of replace_arg using delete_macro_args now. Push extended token context. (next_context, push_ptoken_context, _cpp_push_token_context) (_cpp_push_text_context): Properly initialize the context. (expand_arg): Use the new alloc_expanded_arg_mem, push_extended_tokens_context, cpp_get_token_1, and set_arg_token. (_cpp_pop_context): Really free the memory held by the context. Handle freeing memory used by extended tokens contexts. (cpp_get_token_with_location): Use cpp_get_token_1. (cpp_sys_macro_p): Adjust. (_cpp_backup_tokens): Support the new kinds of token contexts. * traditional.c (recursive_macro): Adjust. 
--- gcc/c-family/c-opts.c | 12 + gcc/c-family/c.opt | 8 + gcc/doc/cppopts.texi | 18 + gcc/doc/invoke.texi | 6 +- gcc/input.c | 2 +- libcpp/directives.c | 4 +- libcpp/include/cpplib.h | 8 + libcpp/include/line-map.h | 2 +- libcpp/init.c | 3 +- libcpp/internal.h | 58 ++- libcpp/lex.c | 41 ++- libcpp/line-map.c | 2 +- libcpp/macro.c | 1327 ++++++++++++++++++++++++++++++++++++++++----- libcpp/traditional.c | 2 +- 14 files changed, 1340 insertions(+), 153 deletions(-) diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 49ff80d..3184539 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -628,6 +628,18 @@ c_common_handle_option (size_t scode, const char *arg, int value, cpp_opts->preprocessed = value; break; + case OPT_ftrack_macro_expansion: + if (value) + value = 2; + /* Fall Through. */ + + case OPT_ftrack_macro_expansion_: + if (arg && *arg != '\0') + cpp_opts->track_macro_expansion = value; + else + cpp_opts->track_macro_expansion = 2; + break; + case OPT_frepo: flag_use_repository = value; if (value) diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index e6ac5dc..07a6b87 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -941,6 +941,14 @@ fpreprocessed C ObjC C++ ObjC++ Treat the input file as already preprocessed +ftrack-macro-expansion +C ObjC C++ ObjC++ JoinedOrMissing RejectNegative UInteger +; converted into ftrack-macro-expansion= + +ftrack-macro-expansion= +C ObjC C++ ObjC++ JoinedOrMissing RejectNegative UInteger +-ftrack-macro-expansion=<0|1|2> Track locations of tokens coming from macro expansion and display them in error messages + fpretty-templates C++ ObjC++ Var(flag_pretty_templates) Init(1) -fno-pretty-templates Do not pretty-print template specializations as the template signature followed by the arguments diff --git a/gcc/doc/cppopts.texi b/gcc/doc/cppopts.texi index 5212478..b225236 100644 --- a/gcc/doc/cppopts.texi +++ b/gcc/doc/cppopts.texi @@ -583,6 +583,24 @@ correct column numbers in warnings or errors, 
even if tabs appear on the line. If the value is less than 1 or greater than 100, the option is ignored. The default is 8. +@item -ftrack-macro-expansion@r{[}=@var{level}@r{]} +@opindex ftrack-macro-expansion +Track locations of tokens across macro expansions. This allows the +compiler to emit diagnostic about the current macro expansion stack +when a compilation error occurs in a macro expansion. Using this +option makes the preprocessor and the compiler consume more +memory. The @var{level} parameter can be used to choose the level of +precision of token location tracking thus decreasing the memory +consumption if necessary. Value @samp{0} of @var{level} de-activates +this option just as if no @option{-ftrack-macro-expansion} was present +on the command line. Value @samp{1} tracks tokens locations in a +degraded mode for the sake of minimal memory overhead. In this mode +all tokens resulting from the expansion of an argument of a +function-like macro have the same location. Value @samp{2} tracks +tokens locations completely. This value is the most memory hungry. +When this option is given no argument, the default parameter value is +@samp{2}. + @item -fexec-charset=@var{charset} @opindex fexec-charset @cindex character set, execution diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 957d75c..7e1b7c2 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -428,9 +428,9 @@ Objective-C and Objective-C++ Dialects}. -iwithprefixbefore @var{dir} -isystem @var{dir} @gol -imultilib @var{dir} -isysroot @var{dir} @gol -M -MM -MF -MG -MP -MQ -MT -nostdinc @gol --P -fworking-directory -remap @gol --trigraphs -undef -U@var{macro} -Wp,@var{option} @gol --Xpreprocessor @var{option}} +-P -ftrack-macro-expansion -fworking-directory @gol +-remap -trigraphs -undef -U@var{macro} @gol +-Wp,@var{option} -Xpreprocessor @var{option}} @item Assembler Option @xref{Assembler Options,,Passing Options to the Assembler}. 
diff --git a/gcc/input.c b/gcc/input.c index 83344d7..89af274 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -1,5 +1,5 @@ /* Data and functions related to line maps and input files. - Copyright (C) 2004, 2007, 2008, 2009, 2010 + Copyright (C) 2004, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This file is part of GCC. diff --git a/libcpp/directives.c b/libcpp/directives.c index a62ddeb..0510c6e 100644 --- a/libcpp/directives.c +++ b/libcpp/directives.c @@ -1,7 +1,7 @@ /* CPP Library. (Directive handling.) Copyright (C) 1986, 1987, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, - 2007, 2008, 2009, 2010 Free Software Foundation, Inc. + 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. Contributed by Per Bothner, 1994-95. Based on CCCP program by Paul Rubin, June 1986 Adapted to ANSI C, Richard Stallman, Jan 1987 @@ -1742,7 +1742,7 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in) saved_cur_run = pfile->cur_run; pfile->context = XNEW (cpp_context); - pfile->context->macro = 0; + pfile->context->c.macro = 0; pfile->context->prev = 0; pfile->context->next = 0; diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index 0e90821..3e01c11 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -393,6 +393,14 @@ struct cpp_options bother trying to do macro expansion and whatnot. */ unsigned char preprocessed; + /* Nonzero means we are tracking locations of tokens involved in + macro expansion. 1 Means we track the location in degraded mode + where we do not track locations of tokens resulting from the + expansion of arguments of function-like macro. 2 Means we do + track all macro expansions. This last option is the one that + consumes the highest amount of memory. */ + unsigned char track_macro_expansion; + /* Nonzero means handle C++ alternate operator names. 
*/ unsigned char operator_names; diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h index 5b7ee9d..af94f32 100644 --- a/libcpp/include/line-map.h +++ b/libcpp/include/line-map.h @@ -1,5 +1,5 @@ /* Map logical line numbers to (source file, line number) pairs. - Copyright (C) 2001, 2003, 2004, 2007, 2008, 2009, 2010 + Copyright (C) 2001, 2003, 2004, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This program is free software; you can redistribute it and/or modify it diff --git a/libcpp/init.c b/libcpp/init.c index 6303868..6771e63 100644 --- a/libcpp/init.c +++ b/libcpp/init.c @@ -154,6 +154,7 @@ cpp_create_reader (enum c_lang lang, hash_table *table, init_library (); pfile = XCNEW (cpp_reader); + memset (&pfile->base_context, 0, sizeof (pfile->base_context)); cpp_set_lang (pfile, lang); CPP_OPTION (pfile, warn_multichar) = 1; @@ -213,7 +214,7 @@ cpp_create_reader (enum c_lang lang, hash_table *table, /* Initialize the base context. */ pfile->context = &pfile->base_context; - pfile->base_context.macro = 0; + pfile->base_context.c.macro = 0; pfile->base_context.prev = pfile->base_context.next = 0; /* Aligned and unaligned storage. */ diff --git a/libcpp/internal.h b/libcpp/internal.h index 588e8ed..fba9a28 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -1,6 +1,6 @@ /* Part of CPP library. Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, - 2008, 2009, 2010 Free Software Foundation, Inc. + 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the @@ -139,6 +139,40 @@ struct tokenrun #define CUR(c) ((c)->u.trad.cur) #define RLIMIT(c) ((c)->u.trad.rlimit) +/* This describes some additional data that is added to the macro + token context of type cpp_context, when -ftrack-macro-expansion is + on. */ +typedef struct +{ + /* The node of the macro we are referring to. 
*/ + cpp_hashnode *macro_node; + /* This buffer contains an array of virtual locations. The virtual + location at index 0 is the virtual location of the token at index + 0 in the current instance of cpp_context; similarly for all the + other virtual locations. */ + source_location *virt_locs; + /* This is a pointer to the current virtual location. This is used + to iterate over the virtual locations while we iterate over the + tokens they belong to. */ + source_location *cur_virt_loc; +} macro_context; + +/* The kind of tokens carried by a cpp_context. */ +enum context_tokens_kind { + /* This is the value of cpp_context::tokens_kind if u.iso.first + contains an instance of cpp_token **. */ + TOKENS_KIND_INDIRECT, + /* This is the value of cpp_context::tokens_kind if u.iso.first + contains an instance of cpp_token *. */ + TOKENS_KIND_DIRECT, + /* This is the value of cpp_context::tokens_kind when the token + context contains tokens resulting from macro expansion. In that + case struct cpp_context::macro points to an instance of struct + macro_context. This is used only when the + -ftrack-macro-expansion flag is on. */ + TOKENS_KIND_EXTENDED +}; + typedef struct cpp_context cpp_context; struct cpp_context { @@ -168,11 +202,24 @@ struct cpp_context When the context is popped, the buffer is released. */ _cpp_buff *buff; - /* For a macro context, the macro node, otherwise NULL. */ - cpp_hashnode *macro; + /* If tokens_kind is TOKEN_KIND_EXTENDED, then (as we thus are in a + macro context) this is a pointer to an instance of macro_context. + Otherwise if tokens_kind is *not* TOKEN_KIND_EXTENDED, then, if + we are in a macro context, this is a pointer to an instance of + cpp_hashnode, representing the name of the macro this context is + for. If we are not in a macro context, then this is just NULL. 
+ Note that when tokens_kind is TOKENS_KIND_EXTENDED, the memory + used by the instance of macro_context pointed to by this member + is de-allocated upon de-allocation of the instance of struct + cpp_context. */ + union + { + macro_context *mc; + cpp_hashnode *macro; + } c; - /* True if utoken element is token, else ptoken. */ - bool direct_p; + /* This determines the type of tokens held by this context. */ + enum context_tokens_kind tokens_kind; }; struct lexer_state @@ -605,6 +652,7 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *); extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *); extern void _cpp_init_tokenrun (tokenrun *, unsigned int); extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *); +extern int _cpp_remaining_tokens_num_in_context (cpp_reader *); /* In init.c. */ extern void _cpp_maybe_push_include_file (cpp_reader *); diff --git a/libcpp/lex.c b/libcpp/lex.c index 75b2b1d..cd6ae9f 100644 --- a/libcpp/lex.c +++ b/libcpp/lex.c @@ -1703,6 +1703,38 @@ next_tokenrun (tokenrun *run) return run->next; } +/* Return the number of not yet processed tokens in the current + context. */ +int +_cpp_remaining_tokens_num_in_context (cpp_reader *pfile) +{ + cpp_context *context = pfile->context; + if (context->tokens_kind == TOKENS_KIND_DIRECT) + return (LAST (context).token + - FIRST (context).token); + else if (context->tokens_kind == TOKENS_KIND_INDIRECT + || context->tokens_kind == TOKENS_KIND_EXTENDED) + return (LAST (context).ptoken + - FIRST (context).ptoken); + else + abort (); +} + +/* Returns the token present at index INDEX in the current context. + If INDEX is zero, the next token to be processed is returned.
*/ +static const cpp_token* +_cpp_token_from_context_at (cpp_reader *pfile, int index) +{ + cpp_context *context = pfile->context; + if (context->tokens_kind == TOKENS_KIND_DIRECT) + return &(FIRST (context).token[index]); + else if (context->tokens_kind == TOKENS_KIND_INDIRECT + || context->tokens_kind == TOKENS_KIND_EXTENDED) + return FIRST (context).ptoken[index]; + else + abort (); +} + /* Look ahead in the input stream. */ const cpp_token * cpp_peek_token (cpp_reader *pfile, int index) @@ -1714,15 +1746,10 @@ cpp_peek_token (cpp_reader *pfile, int index) /* First, scan through any pending cpp_context objects. */ while (context->prev) { - ptrdiff_t sz = (context->direct_p - ? LAST (context).token - FIRST (context).token - : LAST (context).ptoken - FIRST (context).ptoken); + ptrdiff_t sz = _cpp_remaining_tokens_num_in_context (pfile); if (index < (int) sz) - return (context->direct_p - ? FIRST (context).token + index - : *(FIRST (context).ptoken + index)); - + return _cpp_token_from_context_at (pfile, index); index -= (int) sz; context = context->prev; } diff --git a/libcpp/line-map.c b/libcpp/line-map.c index 959566c..9b15bb6 100644 --- a/libcpp/line-map.c +++ b/libcpp/line-map.c @@ -1,5 +1,5 @@ /* Map logical line numbers to (source file, line number) pairs. - Copyright (C) 2001, 2003, 2004, 2007, 2008, 2009 + Copyright (C) 2001, 2003, 2004, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This program is free software; you can redistribute it and/or modify it diff --git a/libcpp/macro.c b/libcpp/macro.c index 03fe79e..bceb128 100644 --- a/libcpp/macro.c +++ b/libcpp/macro.c @@ -1,7 +1,7 @@ /* Part of CPP library. (Macro and #define handling.) Copyright (C) 1986, 1987, 1989, 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, - 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. + 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. Written by Per Bothner, 1994. 
Based on CCCP program by Paul Rubin, June 1986 Adapted to ANSI C, Richard Stallman, Jan 1987 @@ -30,6 +30,10 @@ along with this program; see the file COPYING3. If not see #include "internal.h" typedef struct macro_arg macro_arg; +/* This structure represents the tokens of a macro argument. These + tokens can be macros themselves, in which case they can be either + expanded or unexpanded. When they are expanded, this data + structure keeps both the expanded and unexpanded forms. */ struct macro_arg { const cpp_token **first; /* First token in unexpanded argument. */ @@ -37,17 +41,59 @@ struct macro_arg const cpp_token *stringified; /* Stringified argument. */ unsigned int count; /* # of tokens in argument. */ unsigned int expanded_count; /* # of tokens in expanded argument. */ + source_location *virt_locs; /* Where virtual locations for + unexpanded tokens are stored. */ + source_location *expanded_virt_locs; /* Where virtual locations for + expanded tokens are + stored. */ +}; + +/* The kind of macro tokens which the instance of + macro_arg_token_iter is supposed to iterate over. */ +enum macro_arg_token_kind { + MACRO_ARG_TOKEN_NORMAL, + /* This is a macro argument token that got transformed into a string + literal, e.g. #foo. */ + MACRO_ARG_TOKEN_STRINGIFIED, + /* This is a token resulting from the expansion of a macro + argument that was itself a macro. */ + MACRO_ARG_TOKEN_EXPANDED +}; + +/* An iterator over tokens coming from a function-like macro + argument. */ +typedef struct macro_arg_token_iter macro_arg_token_iter; +struct macro_arg_token_iter +{ + /* Whether or not -ftrack-macro-expansion is used. */ + bool track_macro_exp_p; + /* The kind of token over which we are supposed to iterate. */ + enum macro_arg_token_kind kind; + /* A pointer to the current token pointed to by the iterator. */ + const cpp_token **token_ptr; + /* A pointer to the "full" location of the current token.
If + -ftrack-macro-expansion is used this location tracks loci across + macro expansion. */ + const source_location *location_ptr; +#ifdef ENABLE_CHECKING + /* The number of times the iterator went forward. This is useful only + when checking is enabled. */ + size_t num_forwards; +#endif }; /* Macro expansion. */ static int enter_macro_context (cpp_reader *, cpp_hashnode *, - const cpp_token *); + const cpp_token *, source_location); static int builtin_macro (cpp_reader *, cpp_hashnode *); static void push_ptoken_context (cpp_reader *, cpp_hashnode *, _cpp_buff *, const cpp_token **, unsigned int); +static void push_extended_tokens_context (cpp_reader *, cpp_hashnode *, + _cpp_buff *, source_location *, + const cpp_token **, unsigned int); static _cpp_buff *collect_args (cpp_reader *, const cpp_hashnode *, - _cpp_buff **); + _cpp_buff **, unsigned *); static cpp_context *next_context (cpp_reader *); static const cpp_token *padding_token (cpp_reader *, const cpp_token *); static void expand_arg (cpp_reader *, macro_arg *); @@ -55,10 +101,54 @@ static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int); static const cpp_token *stringify_arg (cpp_reader *, macro_arg *); static void paste_all_tokens (cpp_reader *, const cpp_token *); static bool paste_tokens (cpp_reader *, const cpp_token **, const cpp_token *); +static void alloc_expanded_arg_mem (cpp_reader *, macro_arg *, size_t); +static void ensure_expanded_arg_room (cpp_reader *, macro_arg *, size_t, size_t *); +static void delete_macro_args (_cpp_buff*, unsigned num_args); +static void set_arg_token (macro_arg *, const cpp_token *, + source_location, size_t, + enum macro_arg_token_kind, + bool); +static const source_location *get_arg_token_location (const macro_arg *, + enum macro_arg_token_kind, + bool); +static const cpp_token **arg_token_ptr_at (const macro_arg *, + size_t, + enum macro_arg_token_kind, + source_location **virt_location); + +static void macro_arg_token_iter_init
(macro_arg_token_iter *, bool, + enum macro_arg_token_kind, + const macro_arg *, + const cpp_token **); +static const cpp_token *macro_arg_token_iter_get_token +(const macro_arg_token_iter *it); +static source_location macro_arg_token_iter_get_location +(const macro_arg_token_iter *); +static void macro_arg_token_iter_forward (macro_arg_token_iter *); +static _cpp_buff *tokens_buff_new (cpp_reader *, size_t, + source_location **); +static size_t tokens_buff_count (_cpp_buff *); +static const cpp_token **tokens_buff_last_token_ptr (_cpp_buff *); +static const cpp_token **tokens_buff_put_token_to (const cpp_token **, + source_location *, + const cpp_token *, + source_location, + source_location, + const struct line_map *, + unsigned int); + +static const cpp_token **tokens_buff_add_token (_cpp_buff *, + source_location *, + const cpp_token *, + source_location, + source_location, + const struct line_map *, + unsigned int); +static void tokens_buff_remove_last_token (_cpp_buff *); static void replace_args (cpp_reader *, cpp_hashnode *, cpp_macro *, - macro_arg *); + macro_arg *, source_location); static _cpp_buff *funlike_invocation_p (cpp_reader *, cpp_hashnode *, - _cpp_buff **); + _cpp_buff **, unsigned *); static bool create_iso_definition (cpp_reader *, cpp_macro *); /* #define directive parsing and handling. */ @@ -70,6 +160,11 @@ static bool warn_of_redefinition (cpp_reader *, cpp_hashnode *, static bool parse_params (cpp_reader *, cpp_macro *); static void check_trad_stringification (cpp_reader *, const cpp_macro *, const cpp_string *); +static bool reached_end_of_context (cpp_context *); +static void consume_next_token_from_context (cpp_reader *pfile, + const cpp_token **, + source_location *); +static const cpp_token* cpp_get_token_1 (cpp_reader *, source_location *); /* Emits a warning if NODE is a macro defined in the main file that has not been used. 
*/ @@ -507,7 +602,7 @@ paste_tokens (cpp_reader *pfile, const cpp_token **plhs, const cpp_token *rhs) static void paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) { - const cpp_token *rhs; + const cpp_token *rhs = NULL; cpp_context *context = pfile->context; do @@ -517,10 +612,25 @@ paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) object-like macro, or a function-like macro with arguments inserted. In either case, the constraints to #define guarantee we have at least one more token. */ - if (context->direct_p) + if (context->tokens_kind == TOKENS_KIND_DIRECT) rhs = FIRST (context).token++; - else + else if (context->tokens_kind == TOKENS_KIND_INDIRECT) rhs = *FIRST (context).ptoken++; + else if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + /* So we are in presence of an extended token context, which + means that each token in this context has a virtual + location attached to it. So let's not forget to update + the pointer to the current virtual location of the + current token when we update the pointer to the current + token */ + + rhs = *FIRST (context).ptoken++; + /* context->c.mc must be non-null, as if we were not in a + macro context, context->tokens_kind could not be equal to + TOKENS_KIND_EXTENDED. */ + context->c.mc->cur_virt_loc++; + } if (rhs->type == CPP_PADDING) { @@ -584,23 +694,37 @@ _cpp_arguments_ok (cpp_reader *pfile, cpp_macro *macro, const cpp_hashnode *node NULL. Each argument is terminated by a CPP_EOF token, for the future benefit of expand_arg(). If there are any deferred #pragma directives among macro arguments, store pointers to the - CPP_PRAGMA ... CPP_PRAGMA_EOL tokens into *PRAGMA_BUFF buffer. */ + CPP_PRAGMA ... CPP_PRAGMA_EOL tokens into *PRAGMA_BUFF buffer. + + What is returned is the buffer that contains the memory allocated + to hold the macro arguments. NODE is the name of the macro this + function is dealing with. 
If NUM_ARGS is non-NULL, *NUM_ARGS is + set to the actual number of macro arguments allocated in the + returned buffer. */ static _cpp_buff * collect_args (cpp_reader *pfile, const cpp_hashnode *node, - _cpp_buff **pragma_buff) + _cpp_buff **pragma_buff, unsigned *num_args) { _cpp_buff *buff, *base_buff; cpp_macro *macro; macro_arg *args, *arg; const cpp_token *token; unsigned int argc; + source_location virt_loc; + bool track_macro_expansion_p = CPP_OPTION (pfile, track_macro_expansion); + unsigned num_args_alloced = 0; macro = node->value.macro; if (macro->paramc) argc = macro->paramc; else argc = 1; - buff = _cpp_get_buff (pfile, argc * (50 * sizeof (cpp_token *) + +#define DEFAULT_NUM_TOKENS_PER_MACRO_ARG 50 +#define ARG_TOKENS_EXTENT 1000 + + buff = _cpp_get_buff (pfile, argc * (DEFAULT_NUM_TOKENS_PER_MACRO_ARG + * sizeof (cpp_token *) + sizeof (macro_arg))); base_buff = buff; args = (macro_arg *) buff->base; @@ -615,9 +739,17 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, { unsigned int paren_depth = 0; unsigned int ntokens = 0; + unsigned virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; + num_args_alloced++; argc++; arg->first = (const cpp_token **) buff->cur; + if (track_macro_expansion_p) + { + virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; + arg->virt_locs = XNEWVEC (source_location, + virt_locs_capacity); + } for (;;) { @@ -625,11 +757,20 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, if ((unsigned char *) &arg->first[ntokens + 2] > buff->limit) { buff = _cpp_append_extend_buff (pfile, buff, - 1000 * sizeof (cpp_token *)); + ARG_TOKENS_EXTENT + * sizeof (cpp_token *)); arg->first = (const cpp_token **) buff->cur; } + if (track_macro_expansion_p + && (ntokens + 2 > virt_locs_capacity)) + { + virt_locs_capacity += ARG_TOKENS_EXTENT; + arg->virt_locs = XRESIZEVEC (source_location, + arg->virt_locs, + virt_locs_capacity); + } - token = cpp_get_token (pfile); + token = cpp_get_token_1 (pfile, &virt_loc); if 
(token->type == CPP_PADDING) { @@ -686,7 +827,7 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, BUFF_FRONT (*pragma_buff) += sizeof (cpp_token *); if (token->type == CPP_PRAGMA_EOL) break; - token = cpp_get_token (pfile); + token = cpp_get_token_1 (pfile, &virt_loc); } while (token->type != CPP_EOF); @@ -700,8 +841,10 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, else continue; } - - arg->first[ntokens++] = token; + set_arg_token (arg, token, virt_loc, + ntokens, MACRO_ARG_TOKEN_NORMAL, + CPP_OPTION (pfile, track_macro_expansion)); + ntokens++; } /* Drop trailing padding. */ @@ -709,7 +852,9 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, ntokens--; arg->count = ntokens; - arg->first[ntokens] = &pfile->eof; + set_arg_token (arg, &pfile->eof, pfile->eof.src_loc, + ntokens, MACRO_ARG_TOKEN_NORMAL, + CPP_OPTION (pfile, track_macro_expansion)); /* Terminate the argument. Excess arguments loop back and overwrite the final legitimate argument, before failing. */ @@ -752,6 +897,8 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, || (argc == 1 && args[0].count == 0 && !CPP_OPTION (pfile, std)))) args[macro->paramc - 1].first = NULL; + if (num_args) + *num_args = num_args_alloced; return base_buff; } } @@ -765,10 +912,12 @@ collect_args (cpp_reader *pfile, const cpp_hashnode *node, way that, if none is found, we don't lose the information in any intervening padding tokens. If we find the parenthesis, collect the arguments and return the buffer containing them. PRAGMA_BUFF - argument is the same as in collect_args. */ + argument is the same as in collect_args. If NUM_ARGS is non-NULL, + *NUM_ARGS is set to the number of arguments contained in the + returned buffer. 
*/ static _cpp_buff * funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node, - _cpp_buff **pragma_buff) + _cpp_buff **pragma_buff, unsigned *num_args) { const cpp_token *token, *padding = NULL; @@ -785,7 +934,7 @@ funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node, if (token->type == CPP_OPEN_PAREN) { pfile->state.parsing_args = 2; - return collect_args (pfile, node, pragma_buff); + return collect_args (pfile, node, pragma_buff, num_args); } /* CPP_EOF can be the end of macro arguments, or the end of the @@ -819,13 +968,15 @@ macro_real_token_count (const cpp_macro *macro) /* Push the context of a macro with hash entry NODE onto the context stack. If we can successfully expand the macro, we push a context containing its yet-to-be-rescanned replacement list and return one. - If there were additionally any unexpanded deferred #pragma directives - among macro arguments, push another context containing the - pragma tokens before the yet-to-be-rescanned replacement list - and return two. Otherwise, we don't push a context and return zero. */ + If there were additionally any unexpanded deferred #pragma + directives among macro arguments, push another context containing + the pragma tokens before the yet-to-be-rescanned replacement list + and return two. Otherwise, we don't push a context and return + zero. LOCATION is the location of the expansion point of the + macro. */ static int enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, - const cpp_token *result) + const cpp_token *result, source_location location) { /* The presence of a macro invalidates a file's controlling macro. 
*/ pfile->mi_valid = false; @@ -850,11 +1001,13 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, if (macro->fun_like) { _cpp_buff *buff; + unsigned num_args = 0; pfile->state.prevent_expansion++; pfile->keep_tokens++; pfile->state.parsing_args = 1; - buff = funlike_invocation_p (pfile, node, &pragma_buff); + buff = funlike_invocation_p (pfile, node, &pragma_buff, + &num_args); pfile->state.parsing_args = 0; pfile->keep_tokens--; pfile->state.prevent_expansion--; @@ -873,8 +1026,13 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, } if (macro->paramc > 0) - replace_args (pfile, node, macro, (macro_arg *) buff->base); - _cpp_release_buff (pfile, buff); + replace_args (pfile, node, macro, + (macro_arg *) buff->base, + location); + /* Free the memory used by the arguments of this + function-like macro. This memory has been allocated by + funlike_invocation_p and by replace_args. */ + delete_macro_args (buff, num_args); } /* Disable the macro within its expansion. */ @@ -888,13 +1046,44 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, } if (pfile->cb.used) - pfile->cb.used (pfile, result->src_loc, node); + pfile->cb.used (pfile, location, node); macro->used = 1; if (macro->paramc == 0) - _cpp_push_token_context (pfile, node, macro->exp.tokens, - macro_real_token_count (macro)); + { + if (CPP_OPTION (pfile, track_macro_expansion)) + { + unsigned int i, count = macro->count; + const cpp_token *src = macro->exp.tokens; + const struct line_map *map; + source_location *virt_locs = NULL; + _cpp_buff *macro_tokens = + tokens_buff_new (pfile, count, &virt_locs); + + /* Create a macro map to record the locations of the + tokens that are involved in the expansion. LOCATION + is the location of the macro expansion point. 
*/ + map = linemap_enter_macro (pfile->line_table, + node, location, count); + for (i = 0; i < count; ++i) + { + tokens_buff_add_token (macro_tokens, virt_locs, + src, src->src_loc, + src->src_loc, map, i); + ++src; + } + push_extended_tokens_context (pfile, node, + macro_tokens, + virt_locs, + (const cpp_token **) + macro_tokens->base, + count); + } + else + _cpp_push_token_context (pfile, node, macro->exp.tokens, + macro_real_token_count (macro)); + } if (pragma_buff) { @@ -922,33 +1111,311 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, return builtin_macro (pfile, node); } +/* De-allocate the memory used by BUFF which is an array of instances + of macro_arg. NUM_ARGS is the number of instances of macro_arg + present in BUFF. */ +static void +delete_macro_args (_cpp_buff *buff, unsigned num_args) +{ + macro_arg *macro_args; + unsigned i; + + if (buff == NULL) + return; + + macro_args = (macro_arg *) buff->base; + + /* Walk instances of macro_arg to free their expanded tokens as well + as their macro_arg::virt_locs members. */ + for (i = 0; i < num_args; ++i) + { + if (macro_args[i].expanded) + { + free (macro_args[i].expanded); + macro_args[i].expanded = NULL; + } + if (macro_args[i].virt_locs) + { + free (macro_args[i].virt_locs); + macro_args[i].virt_locs = NULL; + } + if (macro_args[i].expanded_virt_locs) + { + free (macro_args[i].expanded_virt_locs); + macro_args[i].expanded_virt_locs = NULL; + } + } + _cpp_free_buff (buff); +} + +/* Set the INDEXth token of the macro argument ARG. TOKEN is the token + to set, LOCATION is its virtual location. "Virtual" location means + the location that encodes loci across macro expansion. Otherwise + it has to be TOKEN->SRC_LOC. KIND is the kind of tokens the + argument ARG is supposed to contain. Note that ARG must be + tailored so that it has enough room to contain at least INDEX + 1 + tokens.
*/ +static void +set_arg_token (macro_arg *arg, const cpp_token *token, + source_location location, size_t index, + enum macro_arg_token_kind kind, + bool track_macro_exp_p) +{ + const cpp_token **token_ptr; + source_location *loc = NULL; + + token_ptr = + arg_token_ptr_at (arg, index, kind, + track_macro_exp_p ? &loc : NULL); + *token_ptr = token; + + if (loc != NULL) + { +#ifdef ENABLE_CHECKING + if (kind == MACRO_ARG_TOKEN_STRINGIFIED + || !track_macro_exp_p) + /* We can't set the location of a stringified argument + token and we can't set any location if we aren't tracking + macro expansion locations. */ + abort (); +#endif + *loc = location; + } +} + +/* Get the pointer to the location of the argument token of the + function-like macro argument ARG. */ +static const source_location * +get_arg_token_location (const macro_arg *arg, + enum macro_arg_token_kind kind, + bool track_macro_exp_p) +{ + const source_location *loc = NULL; + const cpp_token **token_ptr = + arg_token_ptr_at (arg, 0, kind, + track_macro_exp_p ? (source_location **)&loc : NULL); + + if (token_ptr == NULL) + return NULL; + + if (loc == NULL) + loc = &(*token_ptr)->src_loc; + return loc; +} + +/* Return the pointer to the INDEXth token of the macro argument ARG. + KIND specifies the kind of token the macro argument ARG contains. + If VIRT_LOCATION is non NULL, *VIRT_LOCATION is set to the address + of the virtual location of the returned token if the + -ftrack-macro-expansion flag is on; otherwise, it's set to the + spelling location of the returned token. 
*/ +static const cpp_token ** +arg_token_ptr_at (const macro_arg *arg, size_t index, + enum macro_arg_token_kind kind, + source_location **virt_location) +{ + const cpp_token **tokens_ptr = NULL; + + switch (kind) + { + case MACRO_ARG_TOKEN_NORMAL: + tokens_ptr = arg->first; + break; + case MACRO_ARG_TOKEN_STRINGIFIED: + tokens_ptr = (const cpp_token **) &arg->stringified; + break; + case MACRO_ARG_TOKEN_EXPANDED: + tokens_ptr = arg->expanded; + break; + } + + if (tokens_ptr == NULL) + /* This can happen for, e.g., an empty token argument to a + function-like macro. */ + return tokens_ptr; + + if (virt_location) + { + if (kind == MACRO_ARG_TOKEN_NORMAL) + *virt_location = &arg->virt_locs[index]; + else if (kind == MACRO_ARG_TOKEN_EXPANDED) + *virt_location = &arg->expanded_virt_locs[index]; + else if (kind == MACRO_ARG_TOKEN_STRINGIFIED) + *virt_location = + (source_location *) &tokens_ptr[index]->src_loc; + } + return &tokens_ptr[index]; +} + +/* Initialize an iterator so that it iterates over the tokens of a + function-like macro argument. KIND is the kind of tokens we want + ITER to iterate over. TOKEN_PTR points to the first token ITER will + iterate over. */ +static void +macro_arg_token_iter_init (macro_arg_token_iter *iter, + bool track_macro_exp_p, + enum macro_arg_token_kind kind, + const macro_arg *arg, + const cpp_token **token_ptr) +{ + iter->track_macro_exp_p = track_macro_exp_p; + iter->kind = kind; + iter->token_ptr = token_ptr; + iter->location_ptr = get_arg_token_location (arg, kind, track_macro_exp_p); +#ifdef ENABLE_CHECKING + iter->num_forwards = 0; + if (token_ptr != NULL && iter->location_ptr == NULL) + abort (); +#endif +} + +/* Move the iterator one token forward. Note that if IT was + initialized on an argument that has a stringified token, moving it + forward doesn't make sense as a stringified token is essentially one + string.
*/ +static void +macro_arg_token_iter_forward (macro_arg_token_iter *it) +{ + switch (it->kind) + { + case MACRO_ARG_TOKEN_NORMAL: + case MACRO_ARG_TOKEN_EXPANDED: + it->token_ptr++; + if (it->track_macro_exp_p) + it->location_ptr++; + break; + case MACRO_ARG_TOKEN_STRINGIFIED: +#ifdef ENABLE_CHECKING + if (it->num_forwards > 0) + abort (); +#endif + break; + } + +#ifdef ENABLE_CHECKING + it->num_forwards++; +#endif +} + +/* Return the token pointed to by the iterator. */ +static const cpp_token * +macro_arg_token_iter_get_token (const macro_arg_token_iter *it) +{ +#ifdef ENABLE_CHECKING + if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED + && it->num_forwards > 0) + abort (); +#endif + if (it->token_ptr == NULL) + return NULL; + return *it->token_ptr; +} + +/* Return the location of the token pointed to by the iterator. */ +static source_location +macro_arg_token_iter_get_location (const macro_arg_token_iter *it) +{ +#ifdef ENABLE_CHECKING + if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED + && it->num_forwards > 0) + abort (); +#endif + return *it->location_ptr; +} + +/* Return the index of a token [resulting from macro expansion] inside + the total list of tokens resulting from a given macro + expansion. The index can be different depending on whether we + want each token resulting from function-like macro argument + expansion to have a different location or not. + + E.g., consider this function-like macro: + + #define M(x) x - 3 + + Then consider us "calling" it (and thus expanding it) like: + + M(1+4) + + It will be expanded into: + + 1+4-3 + + Let's consider the case of the token '4'. + + Its index can be 2 (it's the third token of the set of tokens + resulting from the expansion) or it can be 0 if we consider that + all tokens resulting from the expansion of the argument "1+4" have + the same index, which is 0. In this latter case, the index of token + '-' would then be 1 and the index of token '3' would be 2.
+ + The latter case uses less memory, e.g., when + the user passes the option -ftrack-macro-expansion=1. + + ABSOLUTE_TOKEN_INDEX is the index of the macro argument token we + are interested in. CUR_REPLACEMENT_TOKEN is the token of the macro + parameter (inside the macro replacement list) that corresponds to + the macro argument for which ABSOLUTE_TOKEN_INDEX is a token + index. + + If we refer to the example above, for the '4' argument token, + ABSOLUTE_TOKEN_INDEX would be set to 2, and CUR_REPLACEMENT_TOKEN + would be set to the token 'x', in the replacement list "x - 3" of + macro M. + + This is a subroutine of replace_args. */ +inline static unsigned +expanded_token_index (cpp_reader *pfile, cpp_macro *macro, + const cpp_token *cur_replacement_token, + unsigned absolute_token_index) +{ + if (CPP_OPTION (pfile, track_macro_expansion) > 1) + return absolute_token_index; + return cur_replacement_token - macro->exp.tokens; +} + /* Replace the parameters in a function-like macro of NODE with the actual ARGS, and place the result in a newly pushed token context. Expand each argument before replacing, unless it is operated upon - by the # or ## operators. */ + by the # or ## operators. EXPANSION_POINT_LOC is the location of + the expansion point of the macro. E.g., the location of the + function-like macro invocation.
*/ static void -replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, macro_arg *args) +replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, + macro_arg *args, source_location expansion_point_loc) { unsigned int i, total; const cpp_token *src, *limit; - const cpp_token **dest, **first; + const cpp_token **first = NULL; macro_arg *arg; - _cpp_buff *buff; - unsigned int count; + _cpp_buff *buff = NULL; + source_location *virt_locs = NULL; + unsigned int exp_count; + const struct line_map *map = NULL; + int track_macro_exp; /* First, fully macro-expand arguments, calculating the number of tokens in the final expansion as we go. The ordering of the if statements below is subtle; we must handle stringification before pasting. */ - count = macro_real_token_count (macro); - total = count; - limit = macro->exp.tokens + count; + + /* EXP_COUNT is the number of tokens in the macro replacement + list. TOTAL is the number of tokens /after/ macro parameters + have been replaced by their arguments. */ + exp_count = macro_real_token_count (macro); + total = exp_count; + limit = macro->exp.tokens + exp_count; for (src = macro->exp.tokens; src < limit; src++) if (src->type == CPP_MACRO_ARG) { /* Leading and trailing padding tokens. */ total += 2; + /* Account for leading and trailing padding tokens in exp_count too. + This is going to be important later down this function, + when we want to handle the case of (track_macro_exp < + 2). */ + exp_count += 2; /* We have an argument. If it is not being stringified or pasted it is macro-replaced before insertion. */ @@ -970,67 +1437,230 @@ replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, macro_arg } } - /* Now allocate space for the expansion, copy the tokens and replace - the arguments.
*/ - buff = _cpp_get_buff (pfile, total * sizeof (cpp_token *)); + /* When the compiler is called with the -ftrack-macro-expansion + flag, we need to keep track of the location of each token that + results from macro expansion. + + A token resulting from macro expansion is not a new token. It is + simply the same token as the token coming from the macro + definition. The new things that are allocated are the buffer + that holds the tokens resulting from macro expansion and a new + location that records many things like the locus of the expansion + point as well as the original locus inside the definition of the + macro. This location is called a virtual location. + + So the buffer BUFF holds a set of cpp_token*, and the buffer + VIRT_LOCS holds the virtual locations of the tokens held by BUFF. + + Both of these two buffers are going to be hung off of the macro + context, when the latter is pushed. The memory allocated to + store the tokens and their locations is going to be freed once + the context of macro expansion is popped. + + As far as tokens are concerned, the memory overhead of + -ftrack-macro-expansion is proportional to the number of + macros that get expanded multiplied by sizeof (source_location). + The good news is that extra memory gets freed when the macro + context is freed, i.e shortly after the macro got expanded. */ + + /* Is the -ftrack-macro-expansion flag in effect? */ + track_macro_exp = CPP_OPTION (pfile, track_macro_expansion); + + /* Now allocate memory space for tokens and locations resulting from + the macro expansion, copy the tokens and replace the arguments. + This memory must be freed when the context of the macro MACRO is + popped. */ + buff = tokens_buff_new (pfile, total, track_macro_exp ? &virt_locs : NULL); + first = (const cpp_token **) buff->base; - dest = first; + /* Create a macro map to record the locations of the tokens that are + involved in the expansion. 
Note that the expansion point is set + to the location of the closing parenthesis. Otherwise, the + subsequent map created for the first token that comes after the + macro map might have a wrong line number. That would lead to + tokens with wrong line numbers after the macro expansion. This + adds to the memory overhead of the -ftrack-macro-expansion + flag; for every macro that is expanded, a "macro map" is + created. */ + if (track_macro_exp) + { + int num_macro_tokens = total; + if (track_macro_exp < 2) + /* Then the number of macro tokens won't take into account the + fact that function-like macro arguments can expand to + multiple tokens. This is to save memory at the expense of + accuracy. + + Suppose we have #define SQARE(A) A * A + + And then we do SQARE(2+3) + + Then the tokens 2, +, 3, will have the same location, + saying they come from the expansion of the argument A. */ + num_macro_tokens = exp_count; + map = linemap_enter_macro (pfile->line_table, node, + expansion_point_loc, + num_macro_tokens); + } + i = 0; for (src = macro->exp.tokens; src < limit; src++) { - unsigned int count; - const cpp_token **from, **paste_flag; + unsigned int arg_tokens_count; + macro_arg_token_iter from; + const cpp_token **paste_flag = NULL; + const cpp_token **tmp_token_ptr; if (src->type != CPP_MACRO_ARG) { - *dest++ = src; + /* Allocate a virtual location for token SRC, and add that + token and its virtual location into the buffers BUFF and + VIRT_LOCS. */ + unsigned index = expanded_token_index (pfile, macro, src, i); + tokens_buff_add_token (buff, virt_locs, src, + src->src_loc, src->src_loc, + map, index); + i += 1; continue; } paste_flag = 0; arg = &args[src->val.macro_arg.arg_no - 1]; + /* SRC is a macro parameter that we need to replace with its + corresponding argument. So at some point we'll need to + iterate over the tokens of the macro argument and copy them + into the "place" now holding the corresponding macro + parameter.
We are going to use the iterator type + macro_arg_token_iter to handle that iteration. The 'if' + below is to initialize the iterator depending on the type of + tokens the macro argument has. It also does some adjustment + related to padding tokens and some pasting corner cases. */ if (src->flags & STRINGIFY_ARG) - count = 1, from = &arg->stringified; + { + arg_tokens_count = 1; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_STRINGIFIED, + arg, &arg->stringified); + } else if (src->flags & PASTE_LEFT) - count = arg->count, from = arg->first; + { + arg_tokens_count = arg->count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_NORMAL, + arg, arg->first); + } else if (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)) { - count = arg->count, from = arg->first; - if (dest != first) + int num_toks; + arg_tokens_count = arg->count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_NORMAL, + arg, arg->first); + + num_toks = tokens_buff_count (buff); + + if (num_toks != 0) { - if (dest[-1]->type == CPP_COMMA + /* So the current parameter token is pasted to the previous + token in the replacement list. Let's look at what + we have as previous and current arguments. */ + + /* This is the previous argument's token ... */ + tmp_token_ptr = tokens_buff_last_token_ptr (buff); + + if ((*tmp_token_ptr)->type == CPP_COMMA && macro->variadic && src->val.macro_arg.arg_no == macro->paramc) { - /* Swallow a pasted comma if from == NULL, otherwise - drop the paste flag. */ - if (from == NULL) - dest--; + /* ... which is a comma; and the current parameter + is the last parameter of a variadic function-like + macro. If the argument to the current last + parameter is NULL, then swallow the comma, + otherwise drop the paste flag.
*/ + if (macro_arg_token_iter_get_token (&from) == NULL) + tokens_buff_remove_last_token (buff); else - paste_flag = dest - 1; + paste_flag = tmp_token_ptr; } /* Remove the paste flag if the RHS is a placemarker. */ - else if (count == 0) - paste_flag = dest - 1; + else if (arg_tokens_count == 0) + paste_flag = tmp_token_ptr; } } else - count = arg->expanded_count, from = arg->expanded; + { + arg_tokens_count = arg->expanded_count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_EXPANDED, + arg, arg->expanded); + } /* Padding on the left of an argument (unless RHS of ##). */ if ((!pfile->state.in_directive || pfile->state.directive_wants_padding) && src != macro->exp.tokens && !(src[-1].flags & PASTE_LEFT)) - *dest++ = padding_token (pfile, src); + { + const cpp_token *t = padding_token (pfile, src); + unsigned index = expanded_token_index (pfile, macro, src, i); + /* Allocate a virtual location for the padding token and + append the token and its location to BUFF and + VIRT_LOCS. */ + tokens_buff_add_token (buff, virt_locs, t, + t->src_loc, t->src_loc, + map, index); + } - if (count) + if (arg_tokens_count) { - memcpy (dest, from, count * sizeof (cpp_token *)); - dest += count; + /* So now we've got the number of tokens that make up the + argument that is going to replace the current parameter + in the macro's replacement list. */ + unsigned int j; + for (j = 0; j < arg_tokens_count; ++j) + { + /* So if track_macro_exp is < 2, the user wants to + save extra memory while tracking macro expansion + locations. So in that case here is what we do: + + Suppose we have #define SQARE(A) A * A + + And then we do SQARE(2+3) + + Then the tokens 2, +, 3, will have the same location, + saying they come from the expansion of the argument + A. + + So that means we are going to ignore the ARG_TOKENS_COUNT tokens + resulting from the expansion of the current macro + argument.
In other words all the ARG_TOKENS_COUNT tokens + resulting from the expansion of the macro argument will + have the index I. Normally, each of those tokens should + have index I+J. */ + unsigned token_index = i; + unsigned index; + if (track_macro_exp > 1) + token_index += j; + + index = expanded_token_index (pfile, macro, src, token_index); + tokens_buff_add_token (buff, virt_locs, + macro_arg_token_iter_get_token (&from), + macro_arg_token_iter_get_location (&from), + src->src_loc, map, index); + macro_arg_token_iter_forward (&from); + } /* With a non-empty argument on the LHS of ##, the last token should be flagged PASTE_LEFT. */ if (src->flags & PASTE_LEFT) - paste_flag = dest - 1; + paste_flag = + (const cpp_token **) tokens_buff_last_token_ptr (buff); } else if (CPP_PEDANTIC (pfile) && ! macro->syshdr && ! CPP_OPTION (pfile, c99) @@ -1046,7 +1676,12 @@ replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, macro_arg /* Avoid paste on RHS (even case count == 0). */ if (!pfile->state.in_directive && !(src->flags & PASTE_LEFT)) - *dest++ = &pfile->avoid_paste; + { + const cpp_token *t = &pfile->avoid_paste; + tokens_buff_add_token (buff, virt_locs, + t, t->src_loc, t->src_loc, + NULL, 0); + } /* Add a new paste flag, or remove an unwanted one. */ if (paste_flag) @@ -1060,13 +1695,16 @@ replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, macro_arg token->flags = (*paste_flag)->flags & ~PASTE_LEFT; *paste_flag = token; } - } - /* Free the expanded arguments. */ - for (i = 0; i < macro->paramc; i++) - free (args[i].expanded); + i += arg_tokens_count; + } - push_ptoken_context (pfile, node, buff, first, dest - first); + if (track_macro_exp) + push_extended_tokens_context (pfile, node, buff, virt_locs, first, + tokens_buff_count (buff)); + else + push_ptoken_context (pfile, node, buff, first, + tokens_buff_count (buff)); } /* Return a special padding token, with padding inherited from SOURCE.
*/ @@ -1094,6 +1732,7 @@ next_context (cpp_reader *pfile) if (result == 0) { result = XNEW (cpp_context); + memset (result, 0, sizeof (cpp_context)); result->prev = pfile->context; result->next = 0; pfile->context->next = result; @@ -1110,8 +1749,8 @@ push_ptoken_context (cpp_reader *pfile, cpp_hashnode *macro, _cpp_buff *buff, { cpp_context *context = next_context (pfile); - context->direct_p = false; - context->macro = macro; + context->tokens_kind = TOKENS_KIND_INDIRECT; + context->c.macro = macro; context->buff = buff; FIRST (context).ptoken = first; LAST (context).ptoken = first + count; @@ -1122,15 +1761,44 @@ void _cpp_push_token_context (cpp_reader *pfile, cpp_hashnode *macro, const cpp_token *first, unsigned int count) { - cpp_context *context = next_context (pfile); - - context->direct_p = true; - context->macro = macro; - context->buff = NULL; + cpp_context *context = next_context (pfile); + + context->tokens_kind = TOKENS_KIND_DIRECT; + context->c.macro = macro; + context->buff = NULL; FIRST (context).token = first; LAST (context).token = first + count; } +/* Build a context containing a list of tokens as well as their + virtual locations and push it. TOKENS_BUFF is the buffer that + contains the tokens pointed to by FIRST. If TOKENS_BUFF is + non-NULL, it means that the context owns it, meaning that + _cpp_pop_context will free it as well as VIRT_LOCS_BUFF that + contains the virtual locations. 
*/ +static void +push_extended_tokens_context (cpp_reader *pfile, + cpp_hashnode *macro, + _cpp_buff *token_buff, + source_location *virt_locs, + const cpp_token **first, + unsigned int count) +{ + cpp_context *context = next_context (pfile); + macro_context *m; + + context->tokens_kind = TOKENS_KIND_EXTENDED; + context->buff = token_buff; + + m = XNEW (macro_context); + m->macro_node = macro; + m->virt_locs = virt_locs; + m->cur_virt_loc = virt_locs; + context->c.mc = m; + FIRST (context).ptoken = first; + LAST (context).ptoken = first + count; +} + /* Push a traditional macro's replacement text. */ void _cpp_push_text_context (cpp_reader *pfile, cpp_hashnode *macro, @@ -1138,14 +1806,200 @@ _cpp_push_text_context (cpp_reader *pfile, cpp_hashnode *macro, { cpp_context *context = next_context (pfile); - context->direct_p = true; - context->macro = macro; + context->tokens_kind = TOKENS_KIND_DIRECT; + context->c.macro = macro; context->buff = NULL; CUR (context) = start; RLIMIT (context) = start + len; macro->flags |= NODE_DISABLED; } +/* Creates a buffer that holds tokens a.k.a "token buffer", usually + for the purpose of storing them on a cpp_context. If VIRT_LOCS is + non-null (which means that -ftrack-macro-expansion is on), + *VIRT_LOCS is set to a newly allocated buffer that is supposed to + hold the virtual locations of the tokens resulting from macro + expansion. */ +static _cpp_buff* +tokens_buff_new (cpp_reader *pfile, size_t len, + source_location **virt_locs) +{ + size_t tokens_size = len * sizeof (cpp_token *); + size_t locs_size = len * sizeof (source_location); + + if (virt_locs != NULL) + *virt_locs = XNEWVEC (source_location, locs_size); + return _cpp_get_buff (pfile, tokens_size); +} + +/* Returns the number of tokens contained in a token buffer. The + buffer holds a set of cpp_token*. 
*/ +static size_t +tokens_buff_count (_cpp_buff *buff) +{ + return (BUFF_FRONT (buff) - buff->base) / sizeof (cpp_token *); +} + +/* Return a pointer to the last token contained in the token buffer + BUFF. */ +static const cpp_token ** +tokens_buff_last_token_ptr (_cpp_buff *buff) +{ + return &((const cpp_token **) BUFF_FRONT (buff))[-1]; +} + +/* Remove the last token contained in the token buffer TOKENS_BUFF. + This function takes only TOKENS_BUFF; a buffer containing the + virtual locations of the tokens in TOKENS_BUFF, if there is one, + is not updated by this function. */ +static inline void +tokens_buff_remove_last_token (_cpp_buff *tokens_buff) + +{ + if (BUFF_FRONT (tokens_buff) > tokens_buff->base) + BUFF_FRONT (tokens_buff) = + (unsigned char *) &((cpp_token **) BUFF_FRONT (tokens_buff))[-1]; +} + +/* Insert a token into the token buffer at the position pointed to by + DEST. Note that the buffer is not enlarged so the previous token + that was at *DEST is overwritten. VIRT_LOC_DEST, if non-null, + means -ftrack-macro-expansion is in effect; it then points to where to + insert the virtual location of TOKEN. TOKEN is the token to + insert. VIRT_LOC is the virtual location of the token, i.e., the + location possibly encoding its locus across macro expansion. If + TOKEN is an argument of a function-like macro (inside a macro + replacement list), PARM_DEF_LOC is the spelling location of the + macro parameter that TOKEN is replacing, in the replacement list of + the macro. If TOKEN is not an argument of a function-like macro or + if it doesn't come from a macro expansion, then VIRT_LOC can just + be set to the same value as PARM_DEF_LOC. If MAP is non-null, it + means TOKEN comes from a macro expansion and MAP is the macro map + associated with the macro. MACRO_TOKEN_INDEX is the index of + the token in the macro map; it is not considered if MAP is NULL.
+ + Upon successful completion this function returns a pointer to + the position of the token coming right after the insertion + point. */ +static inline const cpp_token ** +tokens_buff_put_token_to (const cpp_token **dest, + source_location *virt_loc_dest, + const cpp_token *token, + source_location virt_loc, + source_location parm_def_loc, + const struct line_map *map, + unsigned int macro_token_index) +{ + source_location macro_loc = virt_loc; + const cpp_token **result; + + if (virt_loc_dest) + { + /* -ftrack-macro-expansion is on. */ + if (map) + macro_loc = linemap_add_macro_token (map, macro_token_index, + virt_loc, parm_def_loc); + *virt_loc_dest = macro_loc; + } + *dest = token; + result = &dest[1]; + + return result; +} + +/* Adds a token at the end of the tokens contained in BUFFER. Note + that this function doesn't enlarge BUFFER when the number of tokens + reaches BUFFER's size; it aborts in that situation. + + TOKEN is the token to append. VIRT_LOC is the virtual location of + the token, i.e., the location possibly encoding its locus across + macro expansion. If TOKEN is an argument of a function-like macro + (inside a macro replacement list), PARM_DEF_LOC is the location of + the macro parameter that TOKEN is replacing. If TOKEN doesn't come + from a macro expansion, then VIRT_LOC can just be set to the same + value as PARM_DEF_LOC. If MAP is non-null, it means TOKEN comes + from a macro expansion and MAP is the macro map associated with the + macro. MACRO_TOKEN_INDEX is the index of the token in the + macro map; it is not considered if MAP is NULL. If VIRT_LOCS is + non-null, it means -ftrack-macro-expansion is on; in which case + this function adds the virtual location of TOKEN to the VIRT_LOCS + array, at the same index as the one of TOKEN in BUFFER. Upon + successful completion this function returns a pointer to the + position of the token coming right after the insertion point.
*/ +static const cpp_token ** +tokens_buff_add_token (_cpp_buff *buffer, + source_location *virt_locs, + const cpp_token *token, + source_location virt_loc, + source_location parm_def_loc, + const struct line_map *map, + unsigned int macro_token_index) +{ + const cpp_token **result; + source_location *virt_loc_dest = NULL; + unsigned token_index = + (BUFF_FRONT (buffer) - buffer->base) / sizeof (cpp_token *); + + /* Abort if we pass the end of the buffer. */ + if (BUFF_FRONT (buffer) > BUFF_LIMIT (buffer)) + abort (); + + if (virt_locs != NULL) + virt_loc_dest = &virt_locs[token_index]; + + result = + tokens_buff_put_token_to ((const cpp_token **) BUFF_FRONT (buffer), + virt_loc_dest, token, virt_loc, parm_def_loc, + map, macro_token_index); + + BUFF_FRONT (buffer) = (unsigned char *) result; + return result; +} + +/* Allocate space for the function-like macro argument ARG to store + the tokens resulting from the macro-expansion of the tokens that + make up ARG itself. That space is allocated in ARG->expanded and + needs to be freed using free. */ +static void +alloc_expanded_arg_mem (cpp_reader *pfile, macro_arg *arg, size_t capacity) +{ +#ifdef ENABLE_CHECKING + if (arg->expanded != NULL + || arg->expanded_virt_locs != NULL) + abort (); +#endif + arg->expanded = XNEWVEC (const cpp_token *, capacity); + if (CPP_OPTION (pfile, track_macro_expansion)) + arg->expanded_virt_locs = XNEWVEC (source_location, capacity); + +} + +/* If necessary, enlarge ARG->expanded so that it can contain SIZE + tokens.
*/ +static void +ensure_expanded_arg_room (cpp_reader *pfile, macro_arg *arg, + size_t size, size_t *expanded_capacity) +{ + if (size <= *expanded_capacity) + return; + + size *= 2; + + arg->expanded = + XRESIZEVEC (const cpp_token *, arg->expanded, size); + *expanded_capacity = size; + + if (CPP_OPTION (pfile, track_macro_expansion)) + { + if (arg->expanded_virt_locs == NULL) + arg->expanded_virt_locs = XNEWVEC (source_location, size); + else + arg->expanded_virt_locs = XRESIZEVEC (source_location, + arg->expanded_virt_locs, + size); + } +} + /* Expand an argument ARG before replacing parameters in a function-like macro. This works by pushing a context with the argument's tokens, and then expanding that into a temporary buffer @@ -1155,38 +2009,48 @@ _cpp_push_text_context (cpp_reader *pfile, cpp_hashnode *macro, static void expand_arg (cpp_reader *pfile, macro_arg *arg) { - unsigned int capacity; + size_t capacity; bool saved_warn_trad; + bool track_macro_exp_p = CPP_OPTION (pfile, track_macro_expansion); - if (arg->count == 0) + if (arg->count == 0 + || arg->expanded != NULL) return; /* Don't warn about funlike macros when pre-expanding. */ saved_warn_trad = CPP_WTRADITIONAL (pfile); CPP_WTRADITIONAL (pfile) = 0; - /* Loop, reading in the arguments. */ + /* Loop, reading in the tokens of the argument. 
*/ capacity = 256; - arg->expanded = XNEWVEC (const cpp_token *, capacity); + alloc_expanded_arg_mem (pfile, arg, capacity); + + if (track_macro_exp_p) + push_extended_tokens_context (pfile, NULL, NULL, + arg->virt_locs, + arg->first, + arg->count + 1); + else + push_ptoken_context (pfile, NULL, NULL, + arg->first, arg->count + 1); - push_ptoken_context (pfile, NULL, NULL, arg->first, arg->count + 1); for (;;) { const cpp_token *token; + source_location location; - if (arg->expanded_count + 1 >= capacity) - { - capacity *= 2; - arg->expanded = XRESIZEVEC (const cpp_token *, arg->expanded, - capacity); - } + ensure_expanded_arg_room (pfile, arg, arg->expanded_count + 1, + &capacity); - token = cpp_get_token (pfile); + token = cpp_get_token_1 (pfile, &location); if (token->type == CPP_EOF) break; - arg->expanded[arg->expanded_count++] = token; + set_arg_token (arg, token, location, + arg->expanded_count, MACRO_ARG_TOKEN_EXPANDED, + CPP_OPTION (pfile, track_macro_expansion)); + arg->expanded_count++; } _cpp_pop_context (pfile); @@ -1195,25 +2059,132 @@ expand_arg (cpp_reader *pfile, macro_arg *arg) } /* Pop the current context off the stack, re-enabling the macro if the - context represented a macro's replacement list. The context - structure is not freed so that we can re-use it later. */ + context represented a macro's replacement list. Initially the + context structure was not freed so that we can re-use it later, but + now we do free it to reduce peak memory consumption. 
*/ void _cpp_pop_context (cpp_reader *pfile) { cpp_context *context = pfile->context; - if (context->macro) - context->macro->flags &= ~NODE_DISABLED; + if (context->c.macro) + { + cpp_hashnode *macro; + if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + macro_context *mc = context->c.mc; + macro = mc->macro_node; + /* If context->buff is set, it means the life time of tokens + is bound to the life time of this context; so we must + free the tokens; that means we must free the virtual + locations of these tokens too. */ + if (context->buff && mc->virt_locs) + { + free (mc->virt_locs); + mc->virt_locs = NULL; + } + free (mc); + context->c.mc = NULL; + } + else + macro = context->c.macro; + + /* Beware that MACRO can be NULL in cases like when we are + called from expand_arg. In those cases, a dummy context with + tokens is pushed just for the purpose of walking them using + cpp_get_token_1. In that case, no 'macro' field is set into + the dummy context. */ + if (macro != NULL) + macro->flags &= ~NODE_DISABLED; + } if (context->buff) - _cpp_release_buff (pfile, context->buff); + { + /* Decrease peak memory consumption by freeing the memory used + by the context. */ + _cpp_free_buff (context->buff); + } pfile->context = context->prev; + /* Decrease peak memory consumption by freeing the context. */ + pfile->context->next = NULL; + free (context); } -/* External routine to get a token. Also used nearly everywhere - internally, except for places where we know we can safely call - _cpp_lex_token directly, such as lexing a directive name. +/* Return TRUE if we reached the end of the set of tokens stored in + CONTEXT, FALSE otherwise.
*/ +static inline bool +reached_end_of_context (cpp_context *context) +{ + if (context->tokens_kind == TOKENS_KIND_DIRECT) + return FIRST (context).token == LAST (context).token; + else if (context->tokens_kind == TOKENS_KIND_INDIRECT + || context->tokens_kind == TOKENS_KIND_EXTENDED) + return FIRST (context).ptoken == LAST (context).ptoken; + else + abort (); +} + +/* Consume the next token contained in the current context of PFILE, + and return it in *TOKEN. Its "full location" is returned in + *LOCATION. If -ftrack-macro-expansion is in effect, "full location" + means the location encoding the locus of the token across macro + expansion; otherwise it is just the "normal" location of the + token, which is (*TOKEN)->src_loc. */ +static inline void +consume_next_token_from_context (cpp_reader *pfile, + const cpp_token ** token, + source_location *location) +{ + cpp_context *c = pfile->context; + + if ((c)->tokens_kind == TOKENS_KIND_DIRECT) + { + *token = FIRST (c).token; + *location = (*token)->src_loc; + FIRST (c).token++; + } + else if ((c)->tokens_kind == TOKENS_KIND_INDIRECT) + { + *token = *FIRST (c).ptoken; + *location = (*token)->src_loc; + FIRST (c).ptoken++; + } + else if ((c)->tokens_kind == TOKENS_KIND_EXTENDED) + { + macro_context *m = c->c.mc; + *token = *FIRST (c).ptoken; + if (m->virt_locs) + { + *location = *m->cur_virt_loc; + m->cur_virt_loc++; + } + else + *location = (*token)->src_loc; + FIRST (c).ptoken++; + } + else + abort (); +} + +/* In the traditional mode of the preprocessor, if we are currently in + a directive, the location of a token must be the location of the + start of the directive line. This function returns the proper + location if we are in the traditional mode, and just returns + LOCATION otherwise.
*/ + +static inline source_location +maybe_adjust_loc_for_trad_cpp (cpp_reader *pfile, source_location location) +{ + if (CPP_OPTION (pfile, traditional)) + { + if (pfile->state.in_directive) + return pfile->directive_line; + } + return location; +} + +/* Routine to get a token as well as its location. Macro expansions and directives are transparently handled, including entering included files. Thus tokens are post-macro @@ -1221,12 +2192,20 @@ _cpp_pop_context (cpp_reader *pfile) see CPP_EOF only at EOF. Internal callers also see it when meeting a directive inside a macro call, when at the end of a directive and state.in_directive is still 1, and at the end of argument - pre-expansion. */ -const cpp_token * -cpp_get_token (cpp_reader *pfile) + pre-expansion. + + LOCATION is an out parameter; *LOCATION is set to the location "as expected + by the user". Please read the comment of + cpp_get_token_with_location to learn more about the meaning of this + location. */ +static const cpp_token* +cpp_get_token_1 (cpp_reader *pfile, source_location *location) { const cpp_token *result; bool can_set = pfile->set_invocation_location; + /* This is a virtual location that either encodes a location + related to macro expansion or is a spelling location. */ + source_location virt_loc = 0; pfile->set_invocation_location = false; for (;;) @@ -1236,20 +2215,21 @@ cpp_get_token (cpp_reader *pfile) /* Context->prev == 0 <=> base context.
*/ if (!context->prev) - result = _cpp_lex_token (pfile); - else if (FIRST (context).token != LAST (context).token) { - if (context->direct_p) - result = FIRST (context).token++; - else - result = *FIRST (context).ptoken++; - + result = _cpp_lex_token (pfile); + virt_loc = result->src_loc; + } + else if (!reached_end_of_context (context)) + { + consume_next_token_from_context (pfile, &result, + &virt_loc); if (result->flags & PASTE_LEFT) { paste_all_tokens (pfile, result); if (pfile->state.in_directive) continue; - return padding_token (pfile, result); + result = padding_token (pfile, result); + goto out; } } else @@ -1257,7 +2237,8 @@ cpp_get_token (cpp_reader *pfile) _cpp_pop_context (pfile); if (pfile->state.in_directive) continue; - return &pfile->avoid_paste; + result = &pfile->avoid_paste; + goto out; } if (pfile->state.in_directive && result->type == CPP_COMMENT) @@ -1276,7 +2257,7 @@ cpp_get_token (cpp_reader *pfile) int ret = 0; /* If not in a macro context, and we're going to start an expansion, record the location. 
*/ - if (can_set && !context->macro) + if (can_set && !context->c.macro) pfile->invocation_location = result->src_loc; if (pfile->state.prevent_expansion) break; @@ -1294,7 +2275,8 @@ cpp_get_token (cpp_reader *pfile) || (peek_tok->flags & PREV_WHITE)); node = pfile->cb.macro_to_expand (pfile, result); if (node) - ret = enter_macro_context (pfile, node, result); + ret = enter_macro_context (pfile, node, result, + virt_loc); else if (whitespace_after) { /* If macro_to_expand hook returned NULL and it @@ -1311,12 +2293,14 @@ cpp_get_token (cpp_reader *pfile) } } else - ret = enter_macro_context (pfile, node, result); + ret = enter_macro_context (pfile, node, result, + virt_loc); if (ret) { if (pfile->state.in_directive || ret == 2) continue; - return padding_token (pfile, result); + result = padding_token (pfile, result); + goto out; } } else @@ -1333,27 +2317,91 @@ cpp_get_token (cpp_reader *pfile) break; } + if (location) + { + if (virt_loc == 0) + virt_loc = result->src_loc; + *location = virt_loc; + } + + out: + if (location != NULL) + { + if (!CPP_OPTION (pfile, track_macro_expansion) + && can_set + && pfile->context->c.macro != NULL) + /* We are in a macro expansion context, are not tracking + virtual location, but were asked to report the location + of the expansion point of the macro being expanded. */ + *location = pfile->invocation_location; + + *location = maybe_adjust_loc_for_trad_cpp (pfile, *location); + } return result; } -/* Like cpp_get_token, but also returns a location separate from the - one provided by the returned token. LOC is an out parameter; *LOC - is set to the location "as expected by the user". This matters - when a token results from macro expansion -- the token's location - will indicate where the macro is defined, but *LOC will be the - location of the start of the expansion. */ +/* External routine to get a token. 
Also used nearly everywhere + internally, except for places where we know we can safely call + _cpp_lex_token directly, such as lexing a directive name. + + Macro expansions and directives are transparently handled, + including entering included files. Thus tokens are post-macro + expansion, and after any intervening directives. External callers + see CPP_EOF only at EOF. Internal callers also see it when meeting + a directive inside a macro call, when at the end of a directive and + state.in_directive is still 1, and at the end of argument + pre-expansion. */ +const cpp_token * +cpp_get_token (cpp_reader *pfile) +{ + return cpp_get_token_1 (pfile, NULL); +} + +/* Like cpp_get_token, but also returns a virtual token location + separate from the spelling location carried by the returned token. + + LOC is an out parameter; *LOC is set to the location "as expected + by the user". This matters when a token results from macro + expansion; in that case the token's spelling location indicates the + locus of the token in the definition of the macro but *LOC + virtually encodes all the other meaningful locuses associated to + the token. + + What? virtual location? Yes, virtual location. + + If the token results from macro expansion and if macro expansion + location tracking is enabled its virtual location encodes (at the + same time): + + - the spelling location of the token + + - the locus of the macro expansion point + + - the locus of the point where the token got instantiated as part + of the macro expansion process. + + You have to use the linemap API to get the locus you are interested + in from a given virtual location. + + Note however that virtual locations are not necessarily ordered for + relations '<' and '>'. One must use the function + linemap_location_before_p instead of using the relational operator + '<'. 
+ + If macro expansion tracking is off and if the token results from + macro expansion the virtual location is the expansion point of the + macro that got expanded. + + When the token doesn't result from macro expansion, the virtual + location is just the same thing as its spelling location. */ + const cpp_token * cpp_get_token_with_location (cpp_reader *pfile, source_location *loc) { const cpp_token *result; pfile->set_invocation_location = true; - result = cpp_get_token (pfile); - if (pfile->context->macro) - *loc = pfile->invocation_location; - else - *loc = result->src_loc; - + result = cpp_get_token_1 (pfile, loc); return result; } @@ -1363,7 +2411,7 @@ cpp_get_token_with_location (cpp_reader *pfile, source_location *loc) int cpp_sys_macro_p (cpp_reader *pfile) { - cpp_hashnode *node = pfile->context->macro; + cpp_hashnode *node = pfile->context->c.macro; return node && node->value.macro && node->value.macro->syshdr; } @@ -1420,10 +2468,27 @@ _cpp_backup_tokens (cpp_reader *pfile, unsigned int count) { if (count != 1) abort (); - if (pfile->context->direct_p) + if (pfile->context->tokens_kind == TOKENS_KIND_DIRECT) FIRST (pfile->context).token--; - else + else if (pfile->context->tokens_kind == TOKENS_KIND_INDIRECT) FIRST (pfile->context).ptoken--; + else if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) + { + FIRST (pfile->context).ptoken--; + if (pfile->context->c.macro) + { + macro_context *m = pfile->context->c.mc; + m->cur_virt_loc--; +#ifdef ENABLE_CHECKING + if (m->cur_virt_loc < m->virt_locs) + abort (); +#endif + } + else + abort (); + } + else + abort (); } } diff --git a/libcpp/traditional.c b/libcpp/traditional.c index 7ff11bb..4206b6f 100644 --- a/libcpp/traditional.c +++ b/libcpp/traditional.c @@ -738,7 +738,7 @@ recursive_macro (cpp_reader *pfile, cpp_hashnode *node) do { depth++; - if (context->macro == node && depth > 20) + if (context->c.macro == node && depth > 20) break; context = context->prev; }