Message ID | 20230630225914.620150-1-lhyatt@gmail.com |
---|---|
State | New |
Series | c-family: Implement pragma_lex () for preprocess-only mode |
May I please ping this? I am just about ready with the followup patch that fixes PR87299, but it depends on this one. Thanks! https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623364.html -Lewis On Fri, Jun 30, 2023 at 6:59 PM Lewis Hyatt <lhyatt@gmail.com> wrote: > > In order to support processing #pragma in preprocess-only mode (-E or > -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from > libcpp. In full compilation modes, this is accomplished by calling > pragma_lex (), which is a symbol that must be exported by the frontend, and > which is currently implemented for C and C++. Neither of those frontends > initializes its parser machinery in preprocess-only mode, and consequently > pragma_lex () does not work in this case. > > Address that by adding a new function c_init_preprocess () for the frontends > to implement, which arranges for pragma_lex () to work in preprocess-only > mode, and adjusting pragma_lex () accordingly. > > In preprocess-only mode, the preprocessor is accustomed to controlling the > interaction with libcpp, and it only knows about tokens that it has called > into libcpp itself to obtain. Since it still needs to see the tokens > obtained by pragma_lex () so that they can be streamed to the output, also > add a new libcpp callback, on_token_lex (), that ensures the preprocessor > sees these tokens too. > > Currently, there is one place where we are already supporting #pragma in > preprocess-only mode, namely the handling of `#pragma GCC diagnostic'. That > was done by directly interfacing with libcpp, rather than making use of > pragma_lex (). Now that pragma_lex () works, that code is no longer > necessary; remove it. > > gcc/c-family/ChangeLog: > > * c-common.h (c_init_preprocess): Declare new function. > * c-opts.cc (c_common_init): Call it. > * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to... > (pragma_diagnostic_lex): ...this. > (pragma_diagnostic_lex_pp): Remove. 
> (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in > all modes. > (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex () > usage. > * c-pragma.h (pragma_lex_discard_to_eol): Declare new function. > > gcc/c/ChangeLog: > > * c-parser.cc (pragma_lex): Support preprocess-only mode. > (pragma_lex_discard_to_eol): New function. > (c_init_preprocess): New function. > > gcc/cp/ChangeLog: > > * parser.cc (c_init_preprocess): New function. > (maybe_read_tokens_for_pragma_lex): New function. > (pragma_lex): Support preprocess-only mode. > (pragma_lex_discard_to_eol): New function. > > libcpp/ChangeLog: > > * include/cpplib.h (struct cpp_callbacks): Add new callback > on_token_lex. > * macro.cc (cpp_get_token_1): Support new callback. > --- > > Notes: > Hello- > > In r13-1544, I added support for processing `#pragma GCC diagnostic' in > preprocess-only mode. Because pragma_lex () doesn't work in that mode, in > that patch I called into libcpp directly to obtain the tokens needed to > process the pragma. As part of the review, Jason noted that it would > probably be better to make pragma_lex () usable in preprocess-only mode, and > we decided just to add a comment about that for the time being, and to go > ahead and implement that in the future, if it became necessary to support > other pragmas during preprocessing. > > I think now is a good time to proceed with that plan, because I would like > to fix PR87299, which is about another pragma (#pragma GCC target) not > working in preprocess-only mode. This patch makes the necessary changes for > pragma_lex () to work in preprocess-only mode. > > I have also added a new callback, on_token_lex (), to libcpp. This is so the > preprocessor can see and stream out all the tokens that pragma_lex () gets > from libcpp, since it won't otherwise see them. This seemed the simplest > approach to me.
Another possibility would be to add a wrapper function in > c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then > also stream the token in preprocess-only mode, and then change all calls > into libcpp in that file to use the wrapper function. The libcpp callback > seemed cleaner to me FWIW. > > There are no new tests added here, since it's just a change of > implementation covered by existing tests. Bootstrap + regtest all languages > looks good on x86-64 Linux. > > Please let me know what you think? Thanks! > > -Lewis > > gcc/c-family/c-common.h | 3 +++ > gcc/c-family/c-opts.cc | 1 + > gcc/c-family/c-pragma.cc | 56 ++++++---------------------------------- > gcc/c-family/c-pragma.h | 2 ++ > gcc/c/c-parser.cc | 34 ++++++++++++++++++++++++ > gcc/cp/parser.cc | 50 +++++++++++++++++++++++++++++++++++ > libcpp/include/cpplib.h | 4 +++ > libcpp/macro.cc | 3 +++ > 8 files changed, 105 insertions(+), 48 deletions(-) > > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h > index b5ef5ff6b2c..78fc5248ba6 100644 > --- a/gcc/c-family/c-common.h > +++ b/gcc/c-family/c-common.h > @@ -990,6 +990,9 @@ extern void c_parse_file (void); > > extern void c_parse_final_cleanups (void); > > +/* This initializes for preprocess-only mode. */ > +extern void c_init_preprocess (void); > + > /* These macros provide convenient access to the various _STMT nodes. 
*/ > > /* Nonzero if a given STATEMENT_LIST represents the outermost binding > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc > index af19140e382..4961af63de8 100644 > --- a/gcc/c-family/c-opts.cc > +++ b/gcc/c-family/c-opts.cc > @@ -1232,6 +1232,7 @@ c_common_init (void) > if (flag_preprocess_only) > { > c_finish_options (); > + c_init_preprocess (); > preprocess_file (parse_in); > return false; > } > diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc > index 0d2b333cebb..73d59df3bf4 100644 > --- a/gcc/c-family/c-pragma.cc > +++ b/gcc/c-family/c-pragma.cc > @@ -840,11 +840,11 @@ public: > > }; > > -/* When compiling normally, use pragma_lex () to obtain the needed tokens. > - This will call into either the C or C++ frontends as appropriate. */ > +/* This will call into either the C or C++ frontends as appropriate to get > + tokens from libcpp for the pragma. */ > > static void > -pragma_diagnostic_lex_normal (pragma_diagnostic_data *result) > +pragma_diagnostic_lex (pragma_diagnostic_data *result) > { > result->clear (); > tree x; > @@ -866,46 +866,6 @@ pragma_diagnostic_lex_normal (pragma_diagnostic_data *result) > result->valid = true; > } > > -/* When preprocessing only, pragma_lex () is not available, so obtain the > - tokens directly from libcpp. We also need to inform the token streamer > - about all tokens we lex ourselves here, so it outputs them too; this is > - done by calling c_pp_stream_token () for each. > - > - ??? If we need to support more pragmas in the future, maybe initialize > - this_parser with the pragma tokens and call pragma_lex () instead? 
> - > -static void > -pragma_diagnostic_lex_pp (pragma_diagnostic_data *result) > -{ > - result->clear (); > - > - auto tok = cpp_get_token_with_location (parse_in, &result->loc_kind); > - c_pp_stream_token (parse_in, tok, result->loc_kind); > - if (!(tok->type == CPP_NAME || tok->type == CPP_KEYWORD)) > - return; > - const unsigned char *const kind_u = cpp_token_as_text (parse_in, tok); > - result->set_kind ((const char *)kind_u); > - if (result->pd_kind == pragma_diagnostic_data::PK_INVALID) > - return; > - > - if (result->needs_option ()) > - { > - tok = cpp_get_token_with_location (parse_in, &result->loc_option); > - c_pp_stream_token (parse_in, tok, result->loc_option); > - if (tok->type != CPP_STRING) > - return; > - cpp_string str; > - if (!cpp_interpret_string_notranslate (parse_in, &tok->val.str, 1, &str, > - CPP_STRING) > - || !str.len) > - return; > - result->option_str = (const char *)str.text; > - result->own_option_str = true; > - } > - > - result->valid = true; > -} > - > /* Handle #pragma GCC diagnostic. Early mode is used by frontends (such as C++) > that do not process the deferred pragma while they are consuming tokens; they > can use early mode to make sure diagnostics affecting the preprocessor itself > @@ -916,10 +876,7 @@ handle_pragma_diagnostic_impl () > static const bool want_diagnostics = (is_pp || !early); > > pragma_diagnostic_data data; > - if (is_pp) > - pragma_diagnostic_lex_pp (&data); > - else > - pragma_diagnostic_lex_normal (&data); > + pragma_diagnostic_lex (&data); > > if (!data.kind_str) > { > @@ -1808,7 +1765,10 @@ c_pp_invoke_early_pragma_handler (unsigned int id) > { > const auto data = &registered_pp_pragmas[id - PRAGMA_FIRST_EXTERNAL]; > if (data->early_handler) > - data->early_handler (parse_in); > + { > + data->early_handler (parse_in); > + pragma_lex_discard_to_eol (); > + } > } > > /* Set up front-end pragmas.
*/ > diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h > index 9cc95ab3ee3..198fa7723e5 100644 > --- a/gcc/c-family/c-pragma.h > +++ b/gcc/c-family/c-pragma.h > @@ -263,7 +263,9 @@ extern tree maybe_apply_renaming_pragma (tree, tree); > extern void maybe_apply_pragma_scalar_storage_order (tree); > extern void add_to_renaming_pragma_list (tree, tree); > > +/* These are to be implemented in each frontend that needs them. */ > extern enum cpp_ttype pragma_lex (tree *, location_t *loc = NULL); > +extern void pragma_lex_discard_to_eol (); > > /* Flags for use with c_lex_with_flags. The values here were picked > so that 0 means to translate and join strings. */ > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc > index 24a6eb6e459..aaf6d704fe6 100644 > --- a/gcc/c/c-parser.cc > +++ b/gcc/c/c-parser.cc > @@ -13355,6 +13355,11 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p) > enum cpp_ttype > pragma_lex (tree *value, location_t *loc) > { > + if (flag_preprocess_only) > + /* Arrange for the preprocessor to see the tokens we're about to read, > + since it won't see them later. */ > + cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token; > + > c_token *tok = c_parser_peek_token (the_parser); > enum cpp_ttype ret = tok->type; > > @@ -13373,9 +13378,29 @@ pragma_lex (tree *value, location_t *loc) > c_parser_consume_token (the_parser); > } > > + cpp_get_callbacks (parse_in)->on_token_lex = nullptr; > return ret; > } > > +void > +pragma_lex_discard_to_eol () > +{ > + if (flag_preprocess_only) > + /* Arrange for the preprocessor to see the tokens we're about to read, > + since it won't see them later. 
*/ > + cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token; > + > + cpp_ttype type; > + do > + { > + type = c_parser_peek_token (the_parser)->type; > + gcc_assert (type != CPP_EOF); > + c_parser_consume_token (the_parser); > + } while (type != CPP_PRAGMA_EOL); > + > + cpp_get_callbacks (parse_in)->on_token_lex = nullptr; > +} > + > static void > c_parser_pragma_pch_preprocess (c_parser *parser) > { > @@ -24756,6 +24781,15 @@ c_parse_file (void) > the_parser = NULL; > } > > +void > +c_init_preprocess (void) > +{ > + /* Create a parser for use by pragma_lex during preprocessing. */ > + the_parser = ggc_alloc<c_parser> (); > + memset (the_parser, 0, sizeof (c_parser)); > + the_parser->tokens = &the_parser->tokens_buf[0]; > +} > + > /* Parse the body of a function declaration marked with "__RTL". > > The RTL parser works on the level of characters read from a > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc > index 5e2b5cba57e..b2f2e222d81 100644 > --- a/gcc/cp/parser.cc > +++ b/gcc/cp/parser.cc > @@ -765,6 +765,15 @@ cp_lexer_new_main (void) > return lexer; > } > > +/* Create a lexer and parser to be used during preprocess-only mode. > + This will be filled with tokens to parse when needed by pragma_lex (). */ > +void > +c_init_preprocess () > +{ > + gcc_assert (!the_parser); > + the_parser = cp_parser_new (cp_lexer_alloc ()); > +} > + > /* Create a new lexer whose token stream is primed with the tokens in > CACHE. When these tokens are exhausted, no new tokens will be read. */ > > @@ -49683,11 +49692,42 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p) > return ret; > } > > +/* Helper for pragma_lex in preprocess-only mode; in this mode, we have not > + populated the lexer with any tokens (the tokens rather being read by > + c-ppoutput.c's machinery), so we need to read enough tokens now to handle > + a pragma. 
*/ > +static void > +maybe_read_tokens_for_pragma_lex () > +{ > + const auto lexer = the_parser->lexer; > + if (!lexer->buffer->is_empty ()) > + return; > + > + /* Arrange for the preprocessor to see the tokens we're about to read, > + since it won't see them later. */ > + cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token; > + > + /* Read the rest of the tokens comprising the pragma line. */ > + cp_token *tok; > + do > + { > + tok = vec_safe_push (lexer->buffer, cp_token ()); > + cp_lexer_get_preprocessor_token (C_LEX_STRING_NO_JOIN, tok); > + gcc_assert (tok->type != CPP_EOF); > + } while (tok->type != CPP_PRAGMA_EOL); > + lexer->next_token = lexer->buffer->address (); > + lexer->last_token = lexer->next_token + lexer->buffer->length () - 1; > + cpp_get_callbacks (parse_in)->on_token_lex = nullptr; > +} > + > /* The interface the pragma parsers have to the lexer. */ > > enum cpp_ttype > pragma_lex (tree *value, location_t *loc) > { > + if (flag_preprocess_only) > + maybe_read_tokens_for_pragma_lex (); > + > cp_token *tok = cp_lexer_peek_token (the_parser->lexer); > enum cpp_ttype ret = tok->type; > > @@ -49710,6 +49750,16 @@ pragma_lex (tree *value, location_t *loc) > return ret; > } > > +void > +pragma_lex_discard_to_eol () > +{ > + /* We have already read all the tokens, so we just need to discard > + them here. */ > + const auto lexer = the_parser->lexer; > + lexer->next_token = lexer->last_token; > + lexer->buffer->truncate (0); > +} > + > > /* External interface. */ > > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h > index aef703f8111..8b63204df0e 100644 > --- a/libcpp/include/cpplib.h > +++ b/libcpp/include/cpplib.h > @@ -784,6 +784,10 @@ struct cpp_callbacks > cpp_buffer containing the translation if translating. */ > char *(*translate_include) (cpp_reader *, line_maps *, location_t, > const char *path); > + > + /* Called when cpp_get_token() / cpp_get_token_with_location() > + have produced a token. 
*/ > + void (*on_token_lex) (cpp_reader *, const cpp_token *, location_t); > }; > > #ifdef VMS > diff --git a/libcpp/macro.cc b/libcpp/macro.cc > index dada8fea835..ebbc1618a71 100644 > --- a/libcpp/macro.cc > +++ b/libcpp/macro.cc > @@ -3135,6 +3135,9 @@ cpp_get_token_1 (cpp_reader *pfile, location_t *location) > } > } > > + if (pfile->cb.on_token_lex) > + pfile->cb.on_token_lex (pfile, result, > + location ? *location : result->src_loc); > return result; > } >
On 6/30/23 18:59, Lewis Hyatt wrote: > In order to support processing #pragma in preprocess-only mode (-E or > -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from > libcpp. In full compilation modes, this is accomplished by calling > pragma_lex (), which is a symbol that must be exported by the frontend, and > which is currently implemented for C and C++. Neither of those frontends > initializes its parser machinery in preprocess-only mode, and consequently > pragma_lex () does not work in this case. > > Address that by adding a new function c_init_preprocess () for the frontends > to implement, which arranges for pragma_lex () to work in preprocess-only > mode, and adjusting pragma_lex () accordingly. > > In preprocess-only mode, the preprocessor is accustomed to controlling the > interaction with libcpp, and it only knows about tokens that it has called > into libcpp itself to obtain. Since it still needs to see the tokens > obtained by pragma_lex () so that they can be streamed to the output, also > add a new libcpp callback, on_token_lex (), that ensures the preprocessor > sees these tokens too. > > Currently, there is one place where we are already supporting #pragma in > preprocess-only mode, namely the handling of `#pragma GCC diagnostic'. That > was done by directly interfacing with libcpp, rather than making use of > pragma_lex (). Now that pragma_lex () works, that code is no longer > necessary; remove it. > > gcc/c-family/ChangeLog: > > * c-common.h (c_init_preprocess): Declare new function. > * c-opts.cc (c_common_init): Call it. > * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to... > (pragma_diagnostic_lex): ...this. > (pragma_diagnostic_lex_pp): Remove. > (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in > all modes. > (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex () > usage. > * c-pragma.h (pragma_lex_discard_to_eol): Declare new function. 
> > gcc/c/ChangeLog: > > * c-parser.cc (pragma_lex): Support preprocess-only mode. > (pragma_lex_discard_to_eol): New function. > (c_init_preprocess): New function. > > gcc/cp/ChangeLog: > > * parser.cc (c_init_preprocess): New function. > (maybe_read_tokens_for_pragma_lex): New function. > (pragma_lex): Support preprocess-only mode. > (pragma_lex_discard_to_eol): New function. > > libcpp/ChangeLog: > > * include/cpplib.h (struct cpp_callbacks): Add new callback > on_token_lex. > * macro.cc (cpp_get_token_1): Support new callback. > --- > > Notes: > Hello- > > In r13-1544, I added support for processing `#pragma GCC diagnostic' in > preprocess-only mode. Because pragma_lex () doesn't work in that mode, in > that patch I called into libcpp directly to obtain the tokens needed to > process the pragma. As part of the review, Jason noted that it would > probably be better to make pragma_lex () usable in preprocess-only mode, and > we decided just to add a comment about that for the time being, and to go > ahead and implement that in the future, if it became necessary to support > other pragmas during preprocessing. > > I think now is a good time to proceed with that plan, because I would like > to fix PR87299, which is about another pragma (#pragma GCC target) not > working in preprocess-only mode. This patch makes the necessary changes for > pragma_lex () to work in preprocess-only mode. > > I have also added a new callback, on_token_lex (), to libcpp. This is so the > preprocessor can see and stream out all the tokens that pragma_lex () gets > from libcpp, since it won't otherwise see them. This seemed the simplest > approach to me. Another possibility would be to add a wrapper function in > c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then > also stream the token in preprocess-only mode, and then change all calls > into libcpp in that file to use the wrapper function. The libcpp callback > seemed cleaner to me FWIW.
I think the other way sounds better to me; there are only three calls to cpp_get_... in c_lex_with_flags. The rest of the patch looks good. Jason
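The wrapper alternative Jason prefers can be sketched as a standalone toy model; this is not GCC code. `toy_reader`, `toy_get_token`, `lex_and_maybe_stream`, `preprocess_only`, and `streamed` are hypothetical stand-ins for the cpp_reader, cpp_get_token_with_location (), the proposed c-lex.cc wrapper, flag_preprocess_only, and the output that c_pp_stream_token () feeds. The idea is only that every call site goes through one function, which forwards the token to the streamer when preprocessing only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for libcpp's reader: hands out tokens one at a time.
struct toy_reader
{
  std::vector<std::string> tokens;
  std::size_t next = 0;
};

// Hypothetical stand-in for flag_preprocess_only.
bool preprocess_only = false;

// Hypothetical stand-in for the -E output stream.
std::vector<std::string> streamed;

// Stand-in for cpp_get_token_with_location (): returns the next raw token.
const std::string &
toy_get_token (toy_reader &r)
{
  assert (r.next < r.tokens.size ());
  return r.tokens[r.next++];
}

// The wrapper: lexer call sites use this instead of toy_get_token (),
// so the streamer sees each token they obtain, but only when
// preprocessing only.
const std::string &
lex_and_maybe_stream (toy_reader &r)
{
  const std::string &tok = toy_get_token (r);
  if (preprocess_only)
    streamed.push_back (tok);
  return tok;
}
```

With this shape the reader itself needs no new hook; the cost is that each direct call in the lexer has to be switched to the wrapper, which Jason notes is only three calls in c_lex_with_flags.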
On Wed, Jul 26, 2023 at 5:36 PM Jason Merrill <jason@redhat.com> wrote: > > On 6/30/23 18:59, Lewis Hyatt wrote: > > In order to support processing #pragma in preprocess-only mode (-E or > > -save-temps for gcc/g++), we need a way to obtain the #pragma tokens from > > libcpp. In full compilation modes, this is accomplished by calling > > pragma_lex (), which is a symbol that must be exported by the frontend, and > > which is currently implemented for C and C++. Neither of those frontends > > initializes its parser machinery in preprocess-only mode, and consequently > > pragma_lex () does not work in this case. > > > > Address that by adding a new function c_init_preprocess () for the frontends > > to implement, which arranges for pragma_lex () to work in preprocess-only > > mode, and adjusting pragma_lex () accordingly. > > > > In preprocess-only mode, the preprocessor is accustomed to controlling the > > interaction with libcpp, and it only knows about tokens that it has called > > into libcpp itself to obtain. Since it still needs to see the tokens > > obtained by pragma_lex () so that they can be streamed to the output, also > > add a new libcpp callback, on_token_lex (), that ensures the preprocessor > > sees these tokens too. > > > > Currently, there is one place where we are already supporting #pragma in > > preprocess-only mode, namely the handling of `#pragma GCC diagnostic'. That > > was done by directly interfacing with libcpp, rather than making use of > > pragma_lex (). Now that pragma_lex () works, that code is no longer > > necessary; remove it. > > > > gcc/c-family/ChangeLog: > > > > * c-common.h (c_init_preprocess): Declare new function. > > * c-opts.cc (c_common_init): Call it. > > * c-pragma.cc (pragma_diagnostic_lex_normal): Rename to... > > (pragma_diagnostic_lex): ...this. > > (pragma_diagnostic_lex_pp): Remove. > > (handle_pragma_diagnostic_impl): Call pragma_diagnostic_lex () in > > all modes. 
> > (c_pp_invoke_early_pragma_handler): Adapt to support pragma_lex () > > usage. > > * c-pragma.h (pragma_lex_discard_to_eol): Declare new function. > > > > gcc/c/ChangeLog: > > > > * c-parser.cc (pragma_lex): Support preprocess-only mode. > > (pragma_lex_discard_to_eol): New function. > > (c_init_preprocess): New function. > > > > gcc/cp/ChangeLog: > > > > * parser.cc (c_init_preprocess): New function. > > (maybe_read_tokens_for_pragma_lex): New function. > > (pragma_lex): Support preprocess-only mode. > > (pragma_lex_discard_to_eol): New function. > > > > libcpp/ChangeLog: > > > > * include/cpplib.h (struct cpp_callbacks): Add new callback > > on_token_lex. > > * macro.cc (cpp_get_token_1): Support new callback. > > --- > > > > Notes: > > Hello- > > > > In r13-1544, I added support for processing `#pragma GCC diagnostic' in > > preprocess-only mode. Because pragma_lex () doesn't work in that mode, in > > that patch I called into libcpp directly to obtain the tokens needed to > > process the pragma. As part of the review, Jason noted that it would > > probably be better to make pragma_lex () usable in preprocess-only mode, and > > we decided just to add a comment about that for the time being, and to go > > ahead and implement that in the future, if it became necessary to support > > other pragmas during preprocessing. > > > > I think now is a good time to proceed with that plan, because I would like > > to fix PR87299, which is about another pragma (#pragma GCC target) not > > working in preprocess-only mode. This patch makes the necessary changes for > > pragma_lex () to work in preprocess-only mode. > > > > I have also added a new callback, on_token_lex (), to libcpp. This is so the > > preprocessor can see and stream out all the tokens that pragma_lex () gets > > from libcpp, since it won't otherwise see them. This seemed the simplest > > approach to me.
Another possibility would be to add a wrapper function in > > c-family/c-lex.cc, which would call cpp_get_token_with_location(), and then > > also stream the token in preprocess-only mode, and then change all calls > > into libcpp in that file to use the wrapper function. The libcpp callback > > seemed cleaner to me FWIW. > > I think the other way sounds better to me; there are only three calls to > cpp_get_... in c_lex_with_flags. > > The rest of the patch looks good. Thank you very much for the feedback. I will test it this way and send the updated version. -Lewis
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b5ef5ff6b2c..78fc5248ba6 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -990,6 +990,9 @@ extern void c_parse_file (void);
 
 extern void c_parse_final_cleanups (void);
 
+/* This initializes for preprocess-only mode.  */
+extern void c_init_preprocess (void);
+
 /* These macros provide convenient access to the various _STMT nodes.  */
 
 /* Nonzero if a given STATEMENT_LIST represents the outermost binding
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index af19140e382..4961af63de8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1232,6 +1232,7 @@ c_common_init (void)
   if (flag_preprocess_only)
     {
       c_finish_options ();
+      c_init_preprocess ();
       preprocess_file (parse_in);
       return false;
     }
diff --git a/gcc/c-family/c-pragma.cc b/gcc/c-family/c-pragma.cc
index 0d2b333cebb..73d59df3bf4 100644
--- a/gcc/c-family/c-pragma.cc
+++ b/gcc/c-family/c-pragma.cc
@@ -840,11 +840,11 @@ public:
 
 };
 
-/* When compiling normally, use pragma_lex () to obtain the needed tokens.
-   This will call into either the C or C++ frontends as appropriate.  */
+/* This will call into either the C or C++ frontends as appropriate to get
+   tokens from libcpp for the pragma.  */
 
 static void
-pragma_diagnostic_lex_normal (pragma_diagnostic_data *result)
+pragma_diagnostic_lex (pragma_diagnostic_data *result)
 {
   result->clear ();
   tree x;
@@ -866,46 +866,6 @@ pragma_diagnostic_lex_normal (pragma_diagnostic_data *result)
   result->valid = true;
 }
 
-/* When preprocessing only, pragma_lex () is not available, so obtain the
-   tokens directly from libcpp.  We also need to inform the token streamer
-   about all tokens we lex ourselves here, so it outputs them too; this is
-   done by calling c_pp_stream_token () for each.
-
-   ???  If we need to support more pragmas in the future, maybe initialize
-   this_parser with the pragma tokens and call pragma_lex () instead?  */
-
-static void
-pragma_diagnostic_lex_pp (pragma_diagnostic_data *result)
-{
-  result->clear ();
-
-  auto tok = cpp_get_token_with_location (parse_in, &result->loc_kind);
-  c_pp_stream_token (parse_in, tok, result->loc_kind);
-  if (!(tok->type == CPP_NAME || tok->type == CPP_KEYWORD))
-    return;
-  const unsigned char *const kind_u = cpp_token_as_text (parse_in, tok);
-  result->set_kind ((const char *)kind_u);
-  if (result->pd_kind == pragma_diagnostic_data::PK_INVALID)
-    return;
-
-  if (result->needs_option ())
-    {
-      tok = cpp_get_token_with_location (parse_in, &result->loc_option);
-      c_pp_stream_token (parse_in, tok, result->loc_option);
-      if (tok->type != CPP_STRING)
-        return;
-      cpp_string str;
-      if (!cpp_interpret_string_notranslate (parse_in, &tok->val.str, 1, &str,
-                                             CPP_STRING)
-          || !str.len)
-        return;
-      result->option_str = (const char *)str.text;
-      result->own_option_str = true;
-    }
-
-  result->valid = true;
-}
-
 /* Handle #pragma GCC diagnostic.  Early mode is used by frontends (such as C++)
    that do not process the deferred pragma while they are consuming tokens; they
    can use early mode to make sure diagnostics affecting the preprocessor itself
@@ -916,10 +876,7 @@ handle_pragma_diagnostic_impl ()
   static const bool want_diagnostics = (is_pp || !early);
 
   pragma_diagnostic_data data;
-  if (is_pp)
-    pragma_diagnostic_lex_pp (&data);
-  else
-    pragma_diagnostic_lex_normal (&data);
+  pragma_diagnostic_lex (&data);
 
   if (!data.kind_str)
     {
@@ -1808,7 +1765,10 @@ c_pp_invoke_early_pragma_handler (unsigned int id)
 {
   const auto data = &registered_pp_pragmas[id - PRAGMA_FIRST_EXTERNAL];
   if (data->early_handler)
-    data->early_handler (parse_in);
+    {
+      data->early_handler (parse_in);
+      pragma_lex_discard_to_eol ();
+    }
 }
 
 /* Set up front-end pragmas.  */
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 9cc95ab3ee3..198fa7723e5 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -263,7 +263,9 @@ extern tree maybe_apply_renaming_pragma (tree, tree);
 extern void maybe_apply_pragma_scalar_storage_order (tree);
 extern void add_to_renaming_pragma_list (tree, tree);
 
+/* These are to be implemented in each frontend that needs them.  */
 extern enum cpp_ttype pragma_lex (tree *, location_t *loc = NULL);
+extern void pragma_lex_discard_to_eol ();
 
 /* Flags for use with c_lex_with_flags.  The values here were picked
    so that 0 means to translate and join strings.  */
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 24a6eb6e459..aaf6d704fe6 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -13355,6 +13355,11 @@ c_parser_pragma (c_parser *parser, enum pragma_context context, bool *if_p)
 enum cpp_ttype
 pragma_lex (tree *value, location_t *loc)
 {
+  if (flag_preprocess_only)
+    /* Arrange for the preprocessor to see the tokens we're about to read,
+       since it won't see them later.  */
+    cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token;
+
   c_token *tok = c_parser_peek_token (the_parser);
   enum cpp_ttype ret = tok->type;
 
@@ -13373,9 +13378,29 @@ pragma_lex (tree *value, location_t *loc)
       c_parser_consume_token (the_parser);
     }
 
+  cpp_get_callbacks (parse_in)->on_token_lex = nullptr;
   return ret;
 }
 
+void
+pragma_lex_discard_to_eol ()
+{
+  if (flag_preprocess_only)
+    /* Arrange for the preprocessor to see the tokens we're about to read,
+       since it won't see them later.  */
+    cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token;
+
+  cpp_ttype type;
+  do
+    {
+      type = c_parser_peek_token (the_parser)->type;
+      gcc_assert (type != CPP_EOF);
+      c_parser_consume_token (the_parser);
+    } while (type != CPP_PRAGMA_EOL);
+
+  cpp_get_callbacks (parse_in)->on_token_lex = nullptr;
+}
+
 static void
 c_parser_pragma_pch_preprocess (c_parser *parser)
 {
@@ -24756,6 +24781,15 @@ c_parse_file (void)
   the_parser = NULL;
 }
 
+void
+c_init_preprocess (void)
+{
+  /* Create a parser for use by pragma_lex during preprocessing.  */
+  the_parser = ggc_alloc<c_parser> ();
+  memset (the_parser, 0, sizeof (c_parser));
+  the_parser->tokens = &the_parser->tokens_buf[0];
+}
+
 /* Parse the body of a function declaration marked with "__RTL".
 
    The RTL parser works on the level of characters read from a
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5e2b5cba57e..b2f2e222d81 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -765,6 +765,15 @@ cp_lexer_new_main (void)
   return lexer;
 }
 
+/* Create a lexer and parser to be used during preprocess-only mode.
+   This will be filled with tokens to parse when needed by pragma_lex ().  */
+void
+c_init_preprocess ()
+{
+  gcc_assert (!the_parser);
+  the_parser = cp_parser_new (cp_lexer_alloc ());
+}
+
 /* Create a new lexer whose token stream is primed with the tokens in
    CACHE.  When these tokens are exhausted, no new tokens will be read.  */
 
@@ -49683,11 +49692,42 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context, bool *if_p)
   return ret;
 }
 
+/* Helper for pragma_lex in preprocess-only mode; in this mode, we have not
+   populated the lexer with any tokens (the tokens rather being read by
+   c-ppoutput.c's machinery), so we need to read enough tokens now to handle
+   a pragma.  */
+static void
+maybe_read_tokens_for_pragma_lex ()
+{
+  const auto lexer = the_parser->lexer;
+  if (!lexer->buffer->is_empty ())
+    return;
+
+  /* Arrange for the preprocessor to see the tokens we're about to read,
+     since it won't see them later.  */
+  cpp_get_callbacks (parse_in)->on_token_lex = c_pp_stream_token;
+
+  /* Read the rest of the tokens comprising the pragma line.  */
+  cp_token *tok;
+  do
+    {
+      tok = vec_safe_push (lexer->buffer, cp_token ());
+      cp_lexer_get_preprocessor_token (C_LEX_STRING_NO_JOIN, tok);
+      gcc_assert (tok->type != CPP_EOF);
+    } while (tok->type != CPP_PRAGMA_EOL);
+  lexer->next_token = lexer->buffer->address ();
+  lexer->last_token = lexer->next_token + lexer->buffer->length () - 1;
+  cpp_get_callbacks (parse_in)->on_token_lex = nullptr;
+}
+
 /* The interface the pragma parsers have to the lexer.  */
 
 enum cpp_ttype
 pragma_lex (tree *value, location_t *loc)
 {
+  if (flag_preprocess_only)
+    maybe_read_tokens_for_pragma_lex ();
+
   cp_token *tok = cp_lexer_peek_token (the_parser->lexer);
   enum cpp_ttype ret = tok->type;
 
@@ -49710,6 +49750,16 @@ pragma_lex (tree *value, location_t *loc)
   return ret;
 }
 
+void
+pragma_lex_discard_to_eol ()
+{
+  /* We have already read all the tokens, so we just need to discard
+     them here.  */
+  const auto lexer = the_parser->lexer;
+  lexer->next_token = lexer->last_token;
+  lexer->buffer->truncate (0);
+}
+
 
 /* External interface.  */
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index aef703f8111..8b63204df0e 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -784,6 +784,10 @@ struct cpp_callbacks
      cpp_buffer containing the translation if translating.  */
   char *(*translate_include) (cpp_reader *, line_maps *, location_t,
                               const char *path);
+
+  /* Called when cpp_get_token() / cpp_get_token_with_location()
+     have produced a token.  */
+  void (*on_token_lex) (cpp_reader *, const cpp_token *, location_t);
 };
 
 #ifdef VMS
diff --git a/libcpp/macro.cc b/libcpp/macro.cc
index dada8fea835..ebbc1618a71 100644
--- a/libcpp/macro.cc
+++ b/libcpp/macro.cc
@@ -3135,6 +3135,9 @@ cpp_get_token_1 (cpp_reader *pfile, location_t *location)
         }
     }
 
+  if (pfile->cb.on_token_lex)
+    pfile->cb.on_token_lex (pfile, result,
+                            location ? *location : result->src_loc);
   return result;
 }
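
The callback mechanism in this version of the patch can be illustrated with a standalone toy model; none of the names below are GCC's. `toy_reader`, `toy_get_token`, `toy_pragma_lex`, `stream_token`, and `streamed` are hypothetical stand-ins for the cpp_reader, cpp_get_token_1 (), the C frontend's pragma_lex (), c_pp_stream_token (), and the preprocessed output. The sketch shows the install, lex, clear sequence only, assuming the hook is a plain function pointer as in struct cpp_callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct toy_reader;
using token_hook = void (*) (toy_reader &, const std::string &);

// Toy stand-in for libcpp's reader, with an optional per-token hook
// playing the role of the on_token_lex callback.
struct toy_reader
{
  std::vector<std::string> tokens;
  std::size_t next = 0;
  token_hook on_token_lex = nullptr;
};

// Hypothetical stand-in for the -E output stream.
std::vector<std::string> streamed;

// Stand-in for c_pp_stream_token (): records the token for output.
void
stream_token (toy_reader &, const std::string &tok)
{
  streamed.push_back (tok);
}

// Stand-in for cpp_get_token_1 (): produce a token, then notify the hook
// if one is installed.
const std::string &
toy_get_token (toy_reader &r)
{
  assert (r.next < r.tokens.size ());
  const std::string &tok = r.tokens[r.next++];
  if (r.on_token_lex)
    r.on_token_lex (r, tok);
  return tok;
}

// Stand-in for pragma_lex (): install the hook only when preprocessing
// only, lex one token, then clear the hook so later direct calls by the
// preprocessor are not streamed twice.
std::string
toy_pragma_lex (toy_reader &r, bool preprocess_only)
{
  if (preprocess_only)
    r.on_token_lex = stream_token;
  std::string tok = toy_get_token (r);
  r.on_token_lex = nullptr;
  return tok;
}
```

Clearing the hook on every exit is what keeps the preprocessor's own calls from being streamed a second time; a scope guard would make that harder to forget if the function gained early returns.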