diff mbox series

lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]

Message ID 20200903080804.GH18149@tucnak
State New
Headers show
Series lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311] | expand

Commit Message

Jakub Jelinek Sept. 3, 2020, 8:08 a.m. UTC
Hi!

As mentioned in the PR, when compiling valgrind even on fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and 8000-ish line number in the same file
we very quickly run out of all possible location_t numbers and because of
that emit non-sensical line numbers in .debug_line.
There are ways how to decrease speed of depleting location_t numbers
in libcpp, but the main reason of this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations.  libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those were the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.

Bootstapped/regtested on x86_64-linux with
../configure --enable-languages=default,ada,obj-c++,lto,go,brig,d --with-build-config=bootstrap-lto --enable-checking=yes,rtl,extra && make -j16 bootstrap > LOG 2>&1 && GXX_TESTSUITE_STDS=98,11,14,17,2a make -j32 -k check > LOGC 2>&1; ../contrib/test_summary > LOGT 2>&1
as well as normally bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

BTW, I wonder about one preexisting issue (but that can be done
incrementally).  On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored.  But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
  ob->current_file = NULL;
  ob->current_line = 0;
  ob->current_col = 0;
  ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info.  But then we do:
      bp_pack_value (bp, ob->current_file != xloc.file, 1);
      bp_pack_value (bp, ob->current_line != xloc.line, 1);
      bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true.  If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero?  Either by oring something into those
tests, or perhaps:
  if (ob->current_reset)
    {
      if (xloc.file == NULL)
	ob->current_file = "";
      if (xloc.line == 0)
	ob->current_line = 1;
      if (xloc.column == 0)
	ob->current_column = 1;
      ob->current_reset = false;
    }
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

	PR lto/94311
	* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
	functions.
	* streamer-hooks.h (struct streamer_hooks): Add
	output_location_and_block callback.  Fix up formatting for
	output_location.
	(stream_output_location_and_block): Define.
	* lto-streamer.h (class lto_location_cache): Fix comment typo.  Add
	current_block member.
	(lto_location_cache::input_location_and_block): New method.
	(lto_location_cache::lto_location_cache): Initialize current_block.
	(lto_location_cache::cached_location): Add block member.
	(struct output_block): Add current_block member.
	(lto_output_location): Formatting fix.
	(lto_output_location_and_block): Declare.
	* lto-streamer.c (lto_streamer_hooks_init): Initialize
	streamer_hooks.output_location_and_block.
	* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
	block members.
	(lto_location_cache::apply_location_cache): Handle blocks.
	(lto_location_cache::accept_location_cache,
	lto_location_cache::revert_location_cache): Fix up function comments.
	(lto_location_cache::input_location_and_block): New method.
	(lto_location_cache::input_location): Implement using
	input_location_and_block.
	(input_function): Invoke apply_location_cache after streaming in all
	bbs.
	* lto-streamer-out.c (clear_line_info): Set current_block.
	(lto_output_location_1): New function, moved from lto_output_location,
	added block handling.
	(lto_output_location): Implement using lto_output_location_1.
	(lto_output_location_and_block): New function.
	* gimple-streamer-in.c (input_phi): Use input_location_and_block
	to input and cache both location and block.
	(input_gimple_stmt): Likewise.
	* gimple-streamer-out.c (output_phi): Use
	stream_output_location_and_block.
	(output_gimple_stmt): Likewise.


	Jakub

Comments

Richard Biener Sept. 3, 2020, 10:07 a.m. UTC | #1
On Thu, 3 Sep 2020, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, when compiling valgrind even on fairly small
> testcase where in one larger function the location keeps oscillating
> between a small line number and 8000-ish line number in the same file
> we very quickly run out of all possible location_t numbers and because of
> that emit non-sensical line numbers in .debug_line.
> There are ways how to decrease speed of depleting location_t numbers
> in libcpp, but the main reason of this is that we use
> stream_input_location_now for streaming in location_t for gimple_location
> and phi arg locations.  libcpp strongly prefers that the locations
> it is given are sorted by the different files and by line numbers in
> ascending order, otherwise it depletes quickly no matter what and is much
> more costly (many extra file changes etc.).
> The reason for not caching those were the BLOCKs that were streamed
> immediately after the location and encoded into the locations (and for PHIs
> we failed to stream the BLOCKs altogether).
> This patch enhances the location cache to handle also BLOCKs (but not for
> everything, only for the spots we care about the BLOCKs) and also optimizes
> the size of the LTO stream by emitting a single bit into a pack whether the
> BLOCK changed from last case and only streaming the BLOCK tree if it
> changed.
> 
> Bootstapped/regtested on x86_64-linux with
> ../configure --enable-languages=default,ada,obj-c++,lto,go,brig,d --with-build-config=bootstrap-lto --enable-checking=yes,rtl,extra && make -j16 bootstrap > LOG 2>&1 && GXX_TESTSUITE_STDS=98,11,14,17,2a make -j32 -k check > LOGC 2>&1; ../contrib/test_summary > LOGT 2>&1
> as well as normally bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

LGTM.

Can we now remove stream_input_location_now?

> BTW, I wonder about one preexisting issue (but that can be done
> incrementally).  On the streamer out side, we call clear_line_info
> in multiple spots which resets the current_* values to something, but on the
> reader side, we don't have corresponding resets in the same location, just have
> the stream_* static variables that keep the current values through the
> entire stream in (so across all the clear_line_info spots in a single LTO
> object but also across jumping from one LTO object to another one).
> Now, in an earlier version of my patch it actually broke LTO bootstrap
> (and a lot of LTO testcases), so for the BLOCK case I've solved it by
> clear_line_info setting current_block to something that should never appear,
> which means that in the LTO stream after the clear_line_info spots including
> the start of the LTO stream we force the block change bit to be set and thus
> BLOCK to be streamed and therefore stream_block from earlier to be
> ignored.  But for the rest I think that is not the case, so I wonder if we
> don't sometimes end up with wrong line/column info because of that, or
> please tell me what prevents that.
> clear_line_info does:
>   ob->current_file = NULL;
>   ob->current_line = 0;
>   ob->current_col = 0;
>   ob->current_sysp = false;
> while I think NULL current_file is something that should likely be different
> from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
> handled separately and not go through the caching), I think line number 0
> can sometimes occur and especially column 0 occurs frequently if we ran out
> of location_t with columns info.  But then we do:
>       bp_pack_value (bp, ob->current_file != xloc.file, 1);
>       bp_pack_value (bp, ob->current_line != xloc.line, 1);
>       bp_pack_value (bp, ob->current_col != xloc.column, 1);
> and stream the details only if the != is true.  If that happens immediately
> after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
> not stream the actual value, so on read-in it would reuse whatever
> stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
> that would signal we are immediately past clear_line_info which would force
> all these != checks to non-zero?  Either by oring something into those
> tests, or perhaps:
>   if (ob->current_reset)
>     {
>       if (xloc.file == NULL)
> 	ob->current_file = "";
>       if (xloc.line == 0)
> 	ob->current_line = 1;
>       if (xloc.column == 0)
> 	ob->current_column = 1;
>       ob->current_reset = false;
>     }
> before doing those bp_pack_value calls with a comment, effectively forcing
> all 6 != comparisons to be true?

Yeah, guess that would be cleanest.  How does UNKNOWN_LOCATION
expand btw?  Using that as "reset" value also might work?

Thanks,
Richard.

> 2020-09-03  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR lto/94311
> 	* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
> 	functions.
> 	* streamer-hooks.h (struct streamer_hooks): Add
> 	output_location_and_block callback.  Fix up formatting for
> 	output_location.
> 	(stream_output_location_and_block): Define.
> 	* lto-streamer.h (class lto_location_cache): Fix comment typo.  Add
> 	current_block member.
> 	(lto_location_cache::input_location_and_block): New method.
> 	(lto_location_cache::lto_location_cache): Initialize current_block.
> 	(lto_location_cache::cached_location): Add block member.
> 	(struct output_block): Add current_block member.
> 	(lto_output_location): Formatting fix.
> 	(lto_output_location_and_block): Declare.
> 	* lto-streamer.c (lto_streamer_hooks_init): Initialize
> 	streamer_hooks.output_location_and_block.
> 	* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
> 	block members.
> 	(lto_location_cache::apply_location_cache): Handle blocks.
> 	(lto_location_cache::accept_location_cache,
> 	lto_location_cache::revert_location_cache): Fix up function comments.
> 	(lto_location_cache::input_location_and_block): New method.
> 	(lto_location_cache::input_location): Implement using
> 	input_location_and_block.
> 	(input_function): Invoke apply_location_cache after streaming in all
> 	bbs.
> 	* lto-streamer-out.c (clear_line_info): Set current_block.
> 	(lto_output_location_1): New function, moved from lto_output_location,
> 	added block handling.
> 	(lto_output_location): Implement using lto_output_location_1.
> 	(lto_output_location_and_block): New function.
> 	* gimple-streamer-in.c (input_phi): Use input_location_and_block
> 	to input and cache both location and block.
> 	(input_gimple_stmt): Likewise.
> 	* gimple-streamer-out.c (output_phi): Use
> 	stream_output_location_and_block.
> 	(output_gimple_stmt): Likewise.
> 
> --- gcc/gimple.h.jj	2020-08-03 22:54:51.426531549 +0200
> +++ gcc/gimple.h	2020-09-02 13:51:38.841642793 +0200
> @@ -1879,6 +1879,14 @@ gimple_set_location (gimple *g, location
>    g->location = location;
>  }
>  
> +/* Return address of the location information for statement G.  */
> +
> +static inline location_t *
> +gimple_location_ptr (gimple *g)
> +{
> +  return &g->location;
> +}
> +
>  
>  /* Return true if G contains location information.  */
>  
> @@ -4581,6 +4589,14 @@ gimple_phi_arg_set_location (gphi *phi,
>    gimple_phi_arg (phi, i)->locus = loc;
>  }
>  
> +/* Return address of source location of gimple argument I of phi node PHI.  */
> +
> +static inline location_t *
> +gimple_phi_arg_location_ptr (gphi *phi, size_t i)
> +{
> +  return &gimple_phi_arg (phi, i)->locus;
> +}
> +
>  /* Return TRUE if argument I of phi node PHI has a location record.  */
>  
>  static inline bool
> --- gcc/streamer-hooks.h.jj	2020-01-12 11:54:36.936405518 +0100
> +++ gcc/streamer-hooks.h	2020-09-02 14:18:03.304429642 +0200
> @@ -54,8 +54,15 @@ struct streamer_hooks {
>    /* [REQ] Called by every streaming routine that needs to read a location.  */
>    void (*input_location) (location_t *, struct bitpack_d *, class data_in *);
>  
> -  /* [REQ] Called by every streaming routine that needs to write a location.  */
> -  void (*output_location) (struct output_block *, struct bitpack_d *, location_t);
> +  /* [REQ] Called by every streaming routine that needs to write a
> +     location.  */
> +  void (*output_location) (struct output_block *, struct bitpack_d *,
> +			   location_t);
> +
> +  /* [REQ] Called by every streaming routine that needs to write a
> +     location, both LOCATION_LOCUS and LOCATION_BLOCK.  */
> +  void (*output_location_and_block) (struct output_block *, struct bitpack_d *,
> +				     location_t);
>  };
>  
>  #define stream_write_tree(OB, EXPR, REF_P) \
> @@ -73,6 +80,9 @@ struct streamer_hooks {
>  #define stream_output_location(OB, BP, LOC) \
>      streamer_hooks.output_location (OB, BP, LOC)
>  
> +#define stream_output_location_and_block(OB, BP, LOC) \
> +    streamer_hooks.output_location_and_block (OB, BP, LOC)
> +
>  /* Streamer hooks.  */
>  extern struct streamer_hooks streamer_hooks;
>  
> --- gcc/lto-streamer.h.jj	2020-08-03 22:54:51.429531506 +0200
> +++ gcc/lto-streamer.h	2020-09-02 14:21:13.829639521 +0200
> @@ -262,7 +262,7 @@ typedef void (lto_free_section_data_f) (
>  
>  /* The location cache holds expanded locations for streamed in trees.
>     This is done to reduce memory usage of libcpp linemap that strongly prefers
> -   locations to be inserted in the soruce order.  */
> +   locations to be inserted in the source order.  */
>  
>  class lto_location_cache
>  {
> @@ -276,9 +276,13 @@ public:
>    void revert_location_cache ();
>    void input_location (location_t *loc, struct bitpack_d *bp,
>  		       class data_in *data_in);
> +  void input_location_and_block (location_t *loc, struct bitpack_d *bp,
> +				 class lto_input_block *ib,
> +				 class data_in *data_in);
>    lto_location_cache ()
>       : loc_cache (), accepted_length (0), current_file (NULL), current_line (0),
> -       current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION)
> +       current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION),
> +       current_block (NULL_TREE)
>    {
>      gcc_assert (!current_cache);
>      current_cache = this;
> @@ -304,6 +308,7 @@ private:
>      location_t *loc;
>      int line, col;
>      bool sysp;
> +    tree block;
>    };
>  
>    /* The location cache.  */
> @@ -325,6 +330,7 @@ private:
>    int current_col;
>    bool current_sysp;
>    location_t current_loc;
> +  tree current_block;
>  };
>  
>  /* Structure used as buffer for reading an LTO file.  */
> @@ -711,6 +717,7 @@ struct output_block
>    int current_line;
>    int current_col;
>    bool current_sysp;
> +  tree current_block;
>  
>    /* Cache of nodes written in this section.  */
>    struct streamer_tree_cache_d *writer_cache;
> @@ -881,7 +888,10 @@ void lto_output_decl_state_streams (stru
>  void lto_output_decl_state_refs (struct output_block *,
>  			         struct lto_output_stream *,
>  			         struct lto_out_decl_state *);
> -void lto_output_location (struct output_block *, struct bitpack_d *, location_t);
> +void lto_output_location (struct output_block *, struct bitpack_d *,
> +			  location_t);
> +void lto_output_location_and_block (struct output_block *, struct bitpack_d *,
> +				    location_t);
>  void lto_output_init_mode_table (void);
>  void lto_prepare_function_for_streaming (cgraph_node *);
>  
> --- gcc/lto-streamer.c.jj	2020-07-28 15:39:09.963756846 +0200
> +++ gcc/lto-streamer.c	2020-09-02 14:19:23.624253409 +0200
> @@ -272,4 +272,5 @@ lto_streamer_hooks_init (void)
>    streamer_hooks.read_tree = lto_input_tree;
>    streamer_hooks.input_location = lto_input_location;
>    streamer_hooks.output_location = lto_output_location;
> +  streamer_hooks.output_location_and_block = lto_output_location_and_block;
>  }
> --- gcc/lto-streamer-in.c.jj	2020-08-27 18:42:35.604712155 +0200
> +++ gcc/lto-streamer-in.c	2020-09-02 15:23:12.454221865 +0200
> @@ -156,7 +156,18 @@ lto_location_cache::cmp_loc (const void
>      return a->sysp ? 1 : -1;
>    if (a->line != b->line)
>      return a->line - b->line;
> -  return a->col - b->col;
> +  if (a->col != b->col)
> +    return a->col - b->col;
> +  if ((a->block == NULL_TREE) != (b->block == NULL_TREE))
> +    return a->block ? 1 : -1;
> +  if (a->block)
> +    {
> +      if (BLOCK_NUMBER (a->block) < BLOCK_NUMBER (b->block))
> +	return -1;
> +      if (BLOCK_NUMBER (a->block) > BLOCK_NUMBER (b->block))
> +	return 1;
> +    }
> +  return 0;
>  }
>  
>  /* Apply all changes in location cache.  Add locations into linemap and patch
> @@ -191,22 +202,33 @@ lto_location_cache::apply_location_cache
>  	  linemap_line_start (line_table, loc.line, max + 1);
>  	}
>        gcc_assert (*loc.loc == BUILTINS_LOCATION + 1);
> -      if (current_file == loc.file && current_line == loc.line
> -	  && current_col == loc.col)
> -	*loc.loc = current_loc;
> -      else
> -        current_loc = *loc.loc = linemap_position_for_column (line_table,
> -							      loc.col);
> +      if (current_file != loc.file
> +	  || current_line != loc.line
> +	  || current_col != loc.col)
> +	{
> +	  current_loc = linemap_position_for_column (line_table, loc.col);
> +	  if (loc.block)
> +	    current_loc = set_block (current_loc, loc.block);
> +	}
> +      else if (current_block != loc.block)
> +	{
> +	  if (loc.block)
> +	    current_loc = set_block (current_loc, loc.block);
> +	  else
> +	    current_loc = LOCATION_LOCUS (current_loc);
> +	}
> +      *loc.loc = current_loc;
>        current_line = loc.line;
>        prev_file = current_file = loc.file;
>        current_col = loc.col;
> +      current_block = loc.block;
>      }
>    loc_cache.truncate (0);
>    accepted_length = 0;
>    return true;
>  }
>  
> -/* Tree merging did not suceed; mark all changes in the cache as accepted.  */
> +/* Tree merging did not succeed; mark all changes in the cache as accepted.  */
>  
>  void
>  lto_location_cache::accept_location_cache ()
> @@ -215,7 +237,7 @@ lto_location_cache::accept_location_cach
>    accepted_length = loc_cache.length ();
>  }
>  
> -/* Tree merging did suceed; throw away recent changes.  */
> +/* Tree merging did succeed; throw away recent changes.  */
>  
>  void
>  lto_location_cache::revert_location_cache ()
> @@ -223,33 +245,46 @@ lto_location_cache::revert_location_cach
>    loc_cache.truncate (accepted_length);
>  }
>  
> -/* Read a location bitpack from input block IB and either update *LOC directly
> -   or add it to the location cache.
> +/* Read a location bitpack from bit pack BP and either update *LOC directly
> +   or add it to the location cache.  If IB is non-NULL, stream in a block
> +   afterwards.
>     It is neccesary to call apply_location_cache to get *LOC updated.  */
>  
>  void
> -lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp,
> -				    class data_in *data_in)
> +lto_location_cache::input_location_and_block (location_t *loc,
> +					      struct bitpack_d *bp,
> +					      class lto_input_block *ib,
> +					      class data_in *data_in)
>  {
>    static const char *stream_file;
>    static int stream_line;
>    static int stream_col;
>    static bool stream_sysp;
> -  bool file_change, line_change, column_change;
> +  static tree stream_block;
>  
>    gcc_assert (current_cache == this);
>  
>    *loc = bp_unpack_int_in_range (bp, "location", 0, RESERVED_LOCATION_COUNT);
>  
>    if (*loc < RESERVED_LOCATION_COUNT)
> -    return;
> +    {
> +      if (ib)
> +	{
> +	  bool block_change = bp_unpack_value (bp, 1);
> +	  if (block_change)
> +	    stream_block = stream_read_tree (ib, data_in);
> +	  if (stream_block)
> +	    *loc = set_block (*loc, stream_block);
> +	}
> +      return;
> +    }
>  
>    /* Keep value RESERVED_LOCATION_COUNT in *loc as linemap lookups will
>       ICE on it.  */
>  
> -  file_change = bp_unpack_value (bp, 1);
> -  line_change = bp_unpack_value (bp, 1);
> -  column_change = bp_unpack_value (bp, 1);
> +  bool file_change = bp_unpack_value (bp, 1);
> +  bool line_change = bp_unpack_value (bp, 1);
> +  bool column_change = bp_unpack_value (bp, 1);
>  
>    if (file_change)
>      {
> @@ -263,21 +298,48 @@ lto_location_cache::input_location (loca
>    if (column_change)
>      stream_col = bp_unpack_var_len_unsigned (bp);
>  
> -  /* This optimization saves location cache operations druing gimple
> +  tree block = NULL_TREE;
> +  if (ib)
> +    {
> +      bool block_change = bp_unpack_value (bp, 1);
> +      if (block_change)
> +	stream_block = stream_read_tree (ib, data_in);
> +      block = stream_block;
> +    }
> +
> +  /* This optimization saves location cache operations during gimple
>       streaming.  */
>       
> -  if (current_file == stream_file && current_line == stream_line
> -      && current_col == stream_col && current_sysp == stream_sysp)
> -    {
> -      *loc = current_loc;
> +  if (current_file == stream_file
> +      && current_line == stream_line
> +      && current_col == stream_col
> +      && current_sysp == stream_sysp)
> +    {
> +      if (current_block == block)
> +	*loc = current_loc;
> +      else if (block)
> +	*loc = set_block (current_loc, block);
> +      else
> +	*loc = LOCATION_LOCUS (current_loc);
>        return;
>      }
>  
>    struct cached_location entry
> -    = {stream_file, loc, stream_line, stream_col, stream_sysp};
> +    = {stream_file, loc, stream_line, stream_col, stream_sysp, block};
>    loc_cache.safe_push (entry);
>  }
>  
> +/* Read a location bitpack from bit pack BP and either update *LOC directly
> +   or add it to the location cache.
> +   It is neccesary to call apply_location_cache to get *LOC updated.  */
> +
> +void
> +lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp,
> +				    class data_in *data_in)
> +{
> +  return input_location_and_block (loc, bp, NULL, data_in);
> +}
> +
>  /* Read a location bitpack from input block IB and either update *LOC directly
>     or add it to the location cache.
>     It is neccesary to call apply_location_cache to get *LOC updated.  */
> @@ -1101,6 +1163,9 @@ input_function (tree fn_decl, class data
>        tag = streamer_read_record_start (ib);
>      }
>  
> +  /* Finalize gimple_location/gimple_block of stmts and phis.  */
> +  data_in->location_cache.apply_location_cache ();
> +
>    /* Fix up the call statements that are mentioned in the callgraph
>       edges.  */
>    set_gimple_stmt_max_uid (cfun, 0);
> --- gcc/lto-streamer-out.c.jj	2020-07-28 15:39:09.963756846 +0200
> +++ gcc/lto-streamer-out.c	2020-09-02 15:16:27.634141378 +0200
> @@ -60,6 +60,10 @@ clear_line_info (struct output_block *ob
>    ob->current_line = 0;
>    ob->current_col = 0;
>    ob->current_sysp = false;
> +  /* Initialize to something that will never appear as block,
> +     so that the first location with block in a function etc.
> +     always streams a change_block bit and the first block.  */
> +  ob->current_block = void_node;
>  }
>  
>  
> @@ -178,40 +182,72 @@ tree_is_indexable (tree t)
>     After outputting bitpack, lto_output_location_data has
>     to be done to output actual data.  */
>  
> -void
> -lto_output_location (struct output_block *ob, struct bitpack_d *bp,
> -		     location_t loc)
> +static void
> +lto_output_location_1 (struct output_block *ob, struct bitpack_d *bp,
> +		       location_t orig_loc, bool block_p)
>  {
> -  expanded_location xloc;
> +  location_t loc = LOCATION_LOCUS (orig_loc);
>  
> -  loc = LOCATION_LOCUS (loc);
>    bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
>  		        loc < RESERVED_LOCATION_COUNT
>  			? loc : RESERVED_LOCATION_COUNT);
> -  if (loc < RESERVED_LOCATION_COUNT)
> -    return;
> -
> -  xloc = expand_location (loc);
> +  if (loc >= RESERVED_LOCATION_COUNT)
> +    {
> +      expanded_location xloc = expand_location (loc);
>  
> -  bp_pack_value (bp, ob->current_file != xloc.file, 1);
> -  bp_pack_value (bp, ob->current_line != xloc.line, 1);
> -  bp_pack_value (bp, ob->current_col != xloc.column, 1);
> +      bp_pack_value (bp, ob->current_file != xloc.file, 1);
> +      bp_pack_value (bp, ob->current_line != xloc.line, 1);
> +      bp_pack_value (bp, ob->current_col != xloc.column, 1);
> +
> +      if (ob->current_file != xloc.file)
> +	{
> +	  bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true);
> +	  bp_pack_value (bp, xloc.sysp, 1);
> +	}
> +      ob->current_file = xloc.file;
> +      ob->current_sysp = xloc.sysp;
> +
> +      if (ob->current_line != xloc.line)
> +	bp_pack_var_len_unsigned (bp, xloc.line);
> +      ob->current_line = xloc.line;
> +
> +      if (ob->current_col != xloc.column)
> +	bp_pack_var_len_unsigned (bp, xloc.column);
> +      ob->current_col = xloc.column;
> +    }
>  
> -  if (ob->current_file != xloc.file)
> +  if (block_p)
>      {
> -      bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true);
> -      bp_pack_value (bp, xloc.sysp, 1);
> +      tree block = LOCATION_BLOCK (orig_loc);
> +      bp_pack_value (bp, ob->current_block != block, 1);
> +      streamer_write_bitpack (bp);
> +      if (ob->current_block != block)
> +	lto_output_tree (ob, block, true, true);
> +      ob->current_block = block;
>      }
> -  ob->current_file = xloc.file;
> -  ob->current_sysp = xloc.sysp;
> +}
> +
> +/* Output info about new location into bitpack BP.
> +   After outputting bitpack, lto_output_location_data has
> +   to be done to output actual data.  */
>  
> -  if (ob->current_line != xloc.line)
> -    bp_pack_var_len_unsigned (bp, xloc.line);
> -  ob->current_line = xloc.line;
> -
> -  if (ob->current_col != xloc.column)
> -    bp_pack_var_len_unsigned (bp, xloc.column);
> -  ob->current_col = xloc.column;
> +void
> +lto_output_location (struct output_block *ob, struct bitpack_d *bp,
> +		     location_t loc)
> +{
> +  lto_output_location_1 (ob, bp, loc, false);
> +}
> +
> +/* Output info about new location into bitpack BP.
> +   After outputting bitpack, lto_output_location_data has
> +   to be done to output actual data.  Like lto_output_location, but
> +   additionally output LOCATION_BLOCK info too and write the BP bitpack.  */
> +
> +void
> +lto_output_location_and_block (struct output_block *ob, struct bitpack_d *bp,
> +			       location_t loc)
> +{
> +  lto_output_location_1 (ob, bp, loc, true);
>  }
>  
>  
> --- gcc/gimple-streamer-in.c.jj	2020-01-12 11:54:36.625410210 +0100
> +++ gcc/gimple-streamer-in.c	2020-09-02 13:53:39.716871790 +0200
> @@ -57,11 +57,7 @@ input_phi (class lto_input_block *ib, ba
>        tree def = stream_read_tree (ib, data_in);
>        int src_index = streamer_read_uhwi (ib);
>        bitpack_d bp = streamer_read_bitpack (ib);
> -      /* Do not cache a location - we do not have API to get pointer to the
> -	 location in PHI statement and we may trigger reallocation.  */
> -      location_t arg_loc = stream_input_location_now (&bp, data_in);
>        basic_block sbb = BASIC_BLOCK_FOR_FN (fn, src_index);
> -
>        edge e = NULL;
>        int j;
>  
> @@ -72,7 +68,11 @@ input_phi (class lto_input_block *ib, ba
>  	    break;
>  	  }
>  
> -      add_phi_arg (result, def, e, arg_loc);
> +      add_phi_arg (result, def, e, UNKNOWN_LOCATION);
> +      /* Read location and lexical block information.  */
> +      location_t *arg_locp = gimple_phi_arg_location_ptr (result, e->dest_idx);
> +      data_in->location_cache.input_location_and_block (arg_locp, &bp, ib,
> +							data_in);
>      }
>  
>    return result;
> @@ -106,12 +106,9 @@ input_gimple_stmt (class lto_input_block
>    has_hist = bp_unpack_value (&bp, 1);
>    stmt->subcode = bp_unpack_var_len_unsigned (&bp);
>  
> -  /* Read location information.  Caching here makes no sense until streamer
> -     cache can handle the following gimple_set_block.  */
> -  gimple_set_location (stmt, stream_input_location_now (&bp, data_in));
> -
> -  /* Read lexical block reference.  */
> -  gimple_set_block (stmt, stream_read_tree (ib, data_in));
> +  /* Read location and lexical block information.  */
> +  data_in->location_cache.input_location_and_block (gimple_location_ptr (stmt),
> +						    &bp, ib, data_in);
>  
>    /* Read in all the operands.  */
>    switch (code)
> --- gcc/gimple-streamer-out.c.jj	2020-01-12 11:54:36.625410210 +0100
> +++ gcc/gimple-streamer-out.c	2020-09-02 14:34:20.360123419 +0200
> @@ -48,8 +48,8 @@ output_phi (struct output_block *ob, gph
>        stream_write_tree (ob, gimple_phi_arg_def (phi, i), true);
>        streamer_write_uhwi (ob, gimple_phi_arg_edge (phi, i)->src->index);
>        bitpack_d bp = bitpack_create (ob->main_stream);
> -      stream_output_location (ob, &bp, gimple_phi_arg_location (phi, i));
> -      streamer_write_bitpack (&bp);
> +      location_t loc = gimple_phi_arg_location (phi, i);
> +      stream_output_location_and_block (ob, &bp, loc);
>      }
>  }
>  
> @@ -84,12 +84,8 @@ output_gimple_stmt (struct output_block
>    bp_pack_value (&bp, hist != NULL, 1);
>    bp_pack_var_len_unsigned (&bp, stmt->subcode);
>  
> -  /* Emit location information for the statement.  */
> -  stream_output_location (ob, &bp, LOCATION_LOCUS (gimple_location (stmt)));
> -  streamer_write_bitpack (&bp);
> -
> -  /* Emit the lexical block holding STMT.  */
> -  stream_write_tree (ob, gimple_block (stmt), true);
> +  /* Emit location information for the statement, including gimple_block.  */
> +  stream_output_location_and_block (ob, &bp, gimple_location (stmt));
>  
>    /* Emit the operands.  */
>    switch (gimple_code (stmt))
> 
> 	Jakub
> 
>
Jakub Jelinek Sept. 3, 2020, 10:28 a.m. UTC | #2
On Thu, Sep 03, 2020 at 12:07:36PM +0200, Richard Biener wrote:
> LGTM.

Thanks.

> Can we now remove stream_input_location_now?

There are a few remaining users, one is:
      case LTO_ert_must_not_throw:
        {
          r->type = ERT_MUST_NOT_THROW;
          r->u.must_not_throw.failure_decl = stream_read_tree (ib, data_in);
          bitpack_d bp = streamer_read_bitpack (ib);
          r->u.must_not_throw.failure_loc
           = stream_input_location_now (&bp, data_in);
        }
and the other two:
  /* Input the function start and end loci.  */
  fn->function_start_locus = stream_input_location_now (&bp, data_in);
  fn->function_end_locus = stream_input_location_now (&bp, data_in);
I know nothing about those, so have kept them as is.  I don't know if
failure_loc can or can't include blocks, what is the reason why it uses
*_now (can r be reallocated?) and similarly about the function start/end.
Will defer these to Honza or anybody else who knows LTO better.

Ok, looking some more, seems r isn't really reallocated and failure_loc
shouldn't really include blocks, judging from
    case ERT_MUST_NOT_THROW:
      new_r->u.must_not_throw.failure_loc =
        LOCATION_LOCUS (old_r->u.must_not_throw.failure_loc);
and
      this_region->u.must_not_throw.failure_loc
        = LOCATION_LOCUS (gimple_location (tp));
as the only setters of failure_loc outside of lto-streamer-in.c.
Dunno if it would be sufficient to just use normal stream_input_location
and let the apply cache after all EH regions and later all bbs are input
apply it.

> > all these != checks to non-zero?  Either by oring something into those
> > tests, or perhaps:
> >   if (ob->current_reset)
> >     {
> >       if (xloc.file == NULL)
> > 	ob->current_file = "";
> >       if (xloc.line == 0)
> > 	ob->current_line = 1;
> >       if (xloc.column == 0)
> > 	ob->current_column = 1;
> >       ob->current_reset = false;
> >     }
> > before doing those bp_pack_value calls with a comment, effectively forcing
> > all 6 != comparisons to be true?
> 
> Yeah, guess that would be cleanest.  How does UNKNOWN_LOCATION
> expand btw?  Using that as "reset" value also might work?

On the lto-streamer-out.c, for UNKNOWN_LOCATION/BUILTINS_LOCATION we just
do:
  bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
                        loc < RESERVED_LOCATION_COUNT
                        ? loc : RESERVED_LOCATION_COUNT);
and that is it (well, for locs with blocks I do more now, but
ob->current_{file,line,column,sysp} aren't touched).

I'll test a patch.

	Jakub
Richard Biener Sept. 3, 2020, 10:44 a.m. UTC | #3
On Thu, 3 Sep 2020, Jakub Jelinek wrote:

> On Thu, Sep 03, 2020 at 12:07:36PM +0200, Richard Biener wrote:
> > LGTM.
> 
> Thanks.
> 
> > Can we now remove stream_input_location_now?
> 
> There are a few remaining users, one is:
>       case LTO_ert_must_not_throw:
>         {
>           r->type = ERT_MUST_NOT_THROW;
>           r->u.must_not_throw.failure_decl = stream_read_tree (ib, data_in);
>           bitpack_d bp = streamer_read_bitpack (ib);
>           r->u.must_not_throw.failure_loc
>            = stream_input_location_now (&bp, data_in);
>         }
> and the other two:
>   /* Input the function start and end loci.  */
>   fn->function_start_locus = stream_input_location_now (&bp, data_in);
>   fn->function_end_locus = stream_input_location_now (&bp, data_in);
> I know nothing about those, so have kept them as is.

I think it's safe to share locations with the rest of the IL and the EH
tree so just replacing them should be fine.

>  I don't know if
> failure_loc can or can't include blocks, what is the reason why it uses
> *_now (can r be reallocated?) and similarly about the function start/end.
> Will defer these to Honza or anybody else who knows LTO better.
> 
> Ok, looking some more, seems r isn't really reallocated and failure_loc
> shouldn't really include blocks, judging from
>     case ERT_MUST_NOT_THROW:
>       new_r->u.must_not_throw.failure_loc =
>         LOCATION_LOCUS (old_r->u.must_not_throw.failure_loc);
> and
>       this_region->u.must_not_throw.failure_loc
>         = LOCATION_LOCUS (gimple_location (tp));
> as the only setters of failure_loc outside of lto-streamer-in.c.
> Dunno if it would be sufficient to just use normal stream_input_location
> and let the apply cache after all EH regions and later all bbs are input
> apply it.

I think so.

> > > all these != checks to non-zero?  Either by oring something into those
> > > tests, or perhaps:
> > >   if (ob->current_reset)
> > >     {
> > >       if (xloc.file == NULL)
> > > 	ob->current_file = "";
> > >       if (xloc.line == 0)
> > > 	ob->current_line = 1;
> > >       if (xloc.column == 0)
> > > 	ob->current_column = 1;
> > >       ob->current_reset = false;
> > >     }
> > > before doing those bp_pack_value calls with a comment, effectively forcing
> > > all 6 != comparisons to be true?
> > 
> > Yeah, guess that would be cleanest.  How does UNKNOWN_LOCATION
> > expand btw?  Using that as "reset" value also might work?
> 
> On the lto-streamer-out.c, for UNKNOWN_LOCATION/BUILTINS_LOCATION we just
> do:
>   bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
>                         loc < RESERVED_LOCATION_COUNT
>                         ? loc : RESERVED_LOCATION_COUNT);
> and that is it (well, for locs with blocks I do more now, but
> ob->current_{file,line,column,sysp} aren't touched).
> 
> I'll test a patch.

Thanks,
Richard.
diff mbox series

Patch

--- gcc/gimple.h.jj	2020-08-03 22:54:51.426531549 +0200
+++ gcc/gimple.h	2020-09-02 13:51:38.841642793 +0200
@@ -1879,6 +1879,14 @@  gimple_set_location (gimple *g, location
   g->location = location;
 }
 
+/* Return address of the location information for statement G.  */
+
+static inline location_t *
+gimple_location_ptr (gimple *g)
+{
+  return &g->location;
+}
+
 
 /* Return true if G contains location information.  */
 
@@ -4581,6 +4589,14 @@  gimple_phi_arg_set_location (gphi *phi,
   gimple_phi_arg (phi, i)->locus = loc;
 }
 
+/* Return address of source location of gimple argument I of phi node PHI.  */
+
+static inline location_t *
+gimple_phi_arg_location_ptr (gphi *phi, size_t i)
+{
+  return &gimple_phi_arg (phi, i)->locus;
+}
+
 /* Return TRUE if argument I of phi node PHI has a location record.  */
 
 static inline bool
--- gcc/streamer-hooks.h.jj	2020-01-12 11:54:36.936405518 +0100
+++ gcc/streamer-hooks.h	2020-09-02 14:18:03.304429642 +0200
@@ -54,8 +54,15 @@  struct streamer_hooks {
   /* [REQ] Called by every streaming routine that needs to read a location.  */
   void (*input_location) (location_t *, struct bitpack_d *, class data_in *);
 
-  /* [REQ] Called by every streaming routine that needs to write a location.  */
-  void (*output_location) (struct output_block *, struct bitpack_d *, location_t);
+  /* [REQ] Called by every streaming routine that needs to write a
+     location.  */
+  void (*output_location) (struct output_block *, struct bitpack_d *,
+			   location_t);
+
+  /* [REQ] Called by every streaming routine that needs to write a
+     location, both LOCATION_LOCUS and LOCATION_BLOCK.  */
+  void (*output_location_and_block) (struct output_block *, struct bitpack_d *,
+				     location_t);
 };
 
 #define stream_write_tree(OB, EXPR, REF_P) \
@@ -73,6 +80,9 @@  struct streamer_hooks {
 #define stream_output_location(OB, BP, LOC) \
     streamer_hooks.output_location (OB, BP, LOC)
 
+#define stream_output_location_and_block(OB, BP, LOC) \
+    streamer_hooks.output_location_and_block (OB, BP, LOC)
+
 /* Streamer hooks.  */
 extern struct streamer_hooks streamer_hooks;
 
--- gcc/lto-streamer.h.jj	2020-08-03 22:54:51.429531506 +0200
+++ gcc/lto-streamer.h	2020-09-02 14:21:13.829639521 +0200
@@ -262,7 +262,7 @@  typedef void (lto_free_section_data_f) (
 
 /* The location cache holds expanded locations for streamed in trees.
    This is done to reduce memory usage of libcpp linemap that strongly prefers
-   locations to be inserted in the soruce order.  */
+   locations to be inserted in the source order.  */
 
 class lto_location_cache
 {
@@ -276,9 +276,13 @@  public:
   void revert_location_cache ();
   void input_location (location_t *loc, struct bitpack_d *bp,
 		       class data_in *data_in);
+  void input_location_and_block (location_t *loc, struct bitpack_d *bp,
+				 class lto_input_block *ib,
+				 class data_in *data_in);
   lto_location_cache ()
      : loc_cache (), accepted_length (0), current_file (NULL), current_line (0),
-       current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION)
+       current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION),
+       current_block (NULL_TREE)
   {
     gcc_assert (!current_cache);
     current_cache = this;
@@ -304,6 +308,7 @@  private:
     location_t *loc;
     int line, col;
     bool sysp;
+    tree block;
   };
 
   /* The location cache.  */
@@ -325,6 +330,7 @@  private:
   int current_col;
   bool current_sysp;
   location_t current_loc;
+  tree current_block;
 };
 
 /* Structure used as buffer for reading an LTO file.  */
@@ -711,6 +717,7 @@  struct output_block
   int current_line;
   int current_col;
   bool current_sysp;
+  tree current_block;
 
   /* Cache of nodes written in this section.  */
   struct streamer_tree_cache_d *writer_cache;
@@ -881,7 +888,10 @@  void lto_output_decl_state_streams (stru
 void lto_output_decl_state_refs (struct output_block *,
 			         struct lto_output_stream *,
 			         struct lto_out_decl_state *);
-void lto_output_location (struct output_block *, struct bitpack_d *, location_t);
+void lto_output_location (struct output_block *, struct bitpack_d *,
+			  location_t);
+void lto_output_location_and_block (struct output_block *, struct bitpack_d *,
+				    location_t);
 void lto_output_init_mode_table (void);
 void lto_prepare_function_for_streaming (cgraph_node *);
 
--- gcc/lto-streamer.c.jj	2020-07-28 15:39:09.963756846 +0200
+++ gcc/lto-streamer.c	2020-09-02 14:19:23.624253409 +0200
@@ -272,4 +272,5 @@  lto_streamer_hooks_init (void)
   streamer_hooks.read_tree = lto_input_tree;
   streamer_hooks.input_location = lto_input_location;
   streamer_hooks.output_location = lto_output_location;
+  streamer_hooks.output_location_and_block = lto_output_location_and_block;
 }
--- gcc/lto-streamer-in.c.jj	2020-08-27 18:42:35.604712155 +0200
+++ gcc/lto-streamer-in.c	2020-09-02 15:23:12.454221865 +0200
@@ -156,7 +156,18 @@  lto_location_cache::cmp_loc (const void
     return a->sysp ? 1 : -1;
   if (a->line != b->line)
     return a->line - b->line;
-  return a->col - b->col;
+  if (a->col != b->col)
+    return a->col - b->col;
+  if ((a->block == NULL_TREE) != (b->block == NULL_TREE))
+    return a->block ? 1 : -1;
+  if (a->block)
+    {
+      if (BLOCK_NUMBER (a->block) < BLOCK_NUMBER (b->block))
+	return -1;
+      if (BLOCK_NUMBER (a->block) > BLOCK_NUMBER (b->block))
+	return 1;
+    }
+  return 0;
 }
 
 /* Apply all changes in location cache.  Add locations into linemap and patch
@@ -191,22 +202,33 @@  lto_location_cache::apply_location_cache
 	  linemap_line_start (line_table, loc.line, max + 1);
 	}
       gcc_assert (*loc.loc == BUILTINS_LOCATION + 1);
-      if (current_file == loc.file && current_line == loc.line
-	  && current_col == loc.col)
-	*loc.loc = current_loc;
-      else
-        current_loc = *loc.loc = linemap_position_for_column (line_table,
-							      loc.col);
+      if (current_file != loc.file
+	  || current_line != loc.line
+	  || current_col != loc.col)
+	{
+	  current_loc = linemap_position_for_column (line_table, loc.col);
+	  if (loc.block)
+	    current_loc = set_block (current_loc, loc.block);
+	}
+      else if (current_block != loc.block)
+	{
+	  if (loc.block)
+	    current_loc = set_block (current_loc, loc.block);
+	  else
+	    current_loc = LOCATION_LOCUS (current_loc);
+	}
+      *loc.loc = current_loc;
       current_line = loc.line;
       prev_file = current_file = loc.file;
       current_col = loc.col;
+      current_block = loc.block;
     }
   loc_cache.truncate (0);
   accepted_length = 0;
   return true;
 }
 
-/* Tree merging did not suceed; mark all changes in the cache as accepted.  */
+/* Tree merging did not succeed; mark all changes in the cache as accepted.  */
 
 void
 lto_location_cache::accept_location_cache ()
@@ -215,7 +237,7 @@  lto_location_cache::accept_location_cach
   accepted_length = loc_cache.length ();
 }
 
-/* Tree merging did suceed; throw away recent changes.  */
+/* Tree merging did succeed; throw away recent changes.  */
 
 void
 lto_location_cache::revert_location_cache ()
@@ -223,33 +245,46 @@  lto_location_cache::revert_location_cach
   loc_cache.truncate (accepted_length);
 }
 
-/* Read a location bitpack from input block IB and either update *LOC directly
-   or add it to the location cache.
+/* Read a location bitpack from bit pack BP and either update *LOC directly
+   or add it to the location cache.  If IB is non-NULL, stream in a block
+   afterwards.
    It is neccesary to call apply_location_cache to get *LOC updated.  */
 
 void
-lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp,
-				    class data_in *data_in)
+lto_location_cache::input_location_and_block (location_t *loc,
+					      struct bitpack_d *bp,
+					      class lto_input_block *ib,
+					      class data_in *data_in)
 {
   static const char *stream_file;
   static int stream_line;
   static int stream_col;
   static bool stream_sysp;
-  bool file_change, line_change, column_change;
+  static tree stream_block;
 
   gcc_assert (current_cache == this);
 
   *loc = bp_unpack_int_in_range (bp, "location", 0, RESERVED_LOCATION_COUNT);
 
   if (*loc < RESERVED_LOCATION_COUNT)
-    return;
+    {
+      if (ib)
+	{
+	  bool block_change = bp_unpack_value (bp, 1);
+	  if (block_change)
+	    stream_block = stream_read_tree (ib, data_in);
+	  if (stream_block)
+	    *loc = set_block (*loc, stream_block);
+	}
+      return;
+    }
 
   /* Keep value RESERVED_LOCATION_COUNT in *loc as linemap lookups will
      ICE on it.  */
 
-  file_change = bp_unpack_value (bp, 1);
-  line_change = bp_unpack_value (bp, 1);
-  column_change = bp_unpack_value (bp, 1);
+  bool file_change = bp_unpack_value (bp, 1);
+  bool line_change = bp_unpack_value (bp, 1);
+  bool column_change = bp_unpack_value (bp, 1);
 
   if (file_change)
     {
@@ -263,21 +298,48 @@  lto_location_cache::input_location (loca
   if (column_change)
     stream_col = bp_unpack_var_len_unsigned (bp);
 
-  /* This optimization saves location cache operations druing gimple
+  tree block = NULL_TREE;
+  if (ib)
+    {
+      bool block_change = bp_unpack_value (bp, 1);
+      if (block_change)
+	stream_block = stream_read_tree (ib, data_in);
+      block = stream_block;
+    }
+
+  /* This optimization saves location cache operations during gimple
      streaming.  */
      
-  if (current_file == stream_file && current_line == stream_line
-      && current_col == stream_col && current_sysp == stream_sysp)
-    {
-      *loc = current_loc;
+  if (current_file == stream_file
+      && current_line == stream_line
+      && current_col == stream_col
+      && current_sysp == stream_sysp)
+    {
+      if (current_block == block)
+	*loc = current_loc;
+      else if (block)
+	*loc = set_block (current_loc, block);
+      else
+	*loc = LOCATION_LOCUS (current_loc);
       return;
     }
 
   struct cached_location entry
-    = {stream_file, loc, stream_line, stream_col, stream_sysp};
+    = {stream_file, loc, stream_line, stream_col, stream_sysp, block};
   loc_cache.safe_push (entry);
 }
 
+/* Read a location bitpack from bit pack BP and either update *LOC directly
+   or add it to the location cache.
+   It is neccesary to call apply_location_cache to get *LOC updated.  */
+
+void
+lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp,
+				    class data_in *data_in)
+{
+  return input_location_and_block (loc, bp, NULL, data_in);
+}
+
 /* Read a location bitpack from input block IB and either update *LOC directly
    or add it to the location cache.
    It is neccesary to call apply_location_cache to get *LOC updated.  */
@@ -1101,6 +1163,9 @@  input_function (tree fn_decl, class data
       tag = streamer_read_record_start (ib);
     }
 
+  /* Finalize gimple_location/gimple_block of stmts and phis.  */
+  data_in->location_cache.apply_location_cache ();
+
   /* Fix up the call statements that are mentioned in the callgraph
      edges.  */
   set_gimple_stmt_max_uid (cfun, 0);
--- gcc/lto-streamer-out.c.jj	2020-07-28 15:39:09.963756846 +0200
+++ gcc/lto-streamer-out.c	2020-09-02 15:16:27.634141378 +0200
@@ -60,6 +60,10 @@  clear_line_info (struct output_block *ob
   ob->current_line = 0;
   ob->current_col = 0;
   ob->current_sysp = false;
+  /* Initialize to something that will never appear as block,
+     so that the first location with block in a function etc.
+     always streams a change_block bit and the first block.  */
+  ob->current_block = void_node;
 }
 
 
@@ -178,40 +182,72 @@  tree_is_indexable (tree t)
    After outputting bitpack, lto_output_location_data has
    to be done to output actual data.  */
 
-void
-lto_output_location (struct output_block *ob, struct bitpack_d *bp,
-		     location_t loc)
+static void
+lto_output_location_1 (struct output_block *ob, struct bitpack_d *bp,
+		       location_t orig_loc, bool block_p)
 {
-  expanded_location xloc;
+  location_t loc = LOCATION_LOCUS (orig_loc);
 
-  loc = LOCATION_LOCUS (loc);
   bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
 		        loc < RESERVED_LOCATION_COUNT
 			? loc : RESERVED_LOCATION_COUNT);
-  if (loc < RESERVED_LOCATION_COUNT)
-    return;
-
-  xloc = expand_location (loc);
+  if (loc >= RESERVED_LOCATION_COUNT)
+    {
+      expanded_location xloc = expand_location (loc);
 
-  bp_pack_value (bp, ob->current_file != xloc.file, 1);
-  bp_pack_value (bp, ob->current_line != xloc.line, 1);
-  bp_pack_value (bp, ob->current_col != xloc.column, 1);
+      bp_pack_value (bp, ob->current_file != xloc.file, 1);
+      bp_pack_value (bp, ob->current_line != xloc.line, 1);
+      bp_pack_value (bp, ob->current_col != xloc.column, 1);
+
+      if (ob->current_file != xloc.file)
+	{
+	  bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true);
+	  bp_pack_value (bp, xloc.sysp, 1);
+	}
+      ob->current_file = xloc.file;
+      ob->current_sysp = xloc.sysp;
+
+      if (ob->current_line != xloc.line)
+	bp_pack_var_len_unsigned (bp, xloc.line);
+      ob->current_line = xloc.line;
+
+      if (ob->current_col != xloc.column)
+	bp_pack_var_len_unsigned (bp, xloc.column);
+      ob->current_col = xloc.column;
+    }
 
-  if (ob->current_file != xloc.file)
+  if (block_p)
     {
-      bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true);
-      bp_pack_value (bp, xloc.sysp, 1);
+      tree block = LOCATION_BLOCK (orig_loc);
+      bp_pack_value (bp, ob->current_block != block, 1);
+      streamer_write_bitpack (bp);
+      if (ob->current_block != block)
+	lto_output_tree (ob, block, true, true);
+      ob->current_block = block;
     }
-  ob->current_file = xloc.file;
-  ob->current_sysp = xloc.sysp;
+}
+
+/* Output info about new location into bitpack BP.
+   After outputting bitpack, lto_output_location_data has
+   to be done to output actual data.  */
 
-  if (ob->current_line != xloc.line)
-    bp_pack_var_len_unsigned (bp, xloc.line);
-  ob->current_line = xloc.line;
-
-  if (ob->current_col != xloc.column)
-    bp_pack_var_len_unsigned (bp, xloc.column);
-  ob->current_col = xloc.column;
+void
+lto_output_location (struct output_block *ob, struct bitpack_d *bp,
+		     location_t loc)
+{
+  lto_output_location_1 (ob, bp, loc, false);
+}
+
+/* Output info about new location into bitpack BP.
+   After outputting bitpack, lto_output_location_data has
+   to be done to output actual data.  Like lto_output_location, but
+   additionally output LOCATION_BLOCK info too and write the BP bitpack.  */
+
+void
+lto_output_location_and_block (struct output_block *ob, struct bitpack_d *bp,
+			       location_t loc)
+{
+  lto_output_location_1 (ob, bp, loc, true);
 }
 
 
--- gcc/gimple-streamer-in.c.jj	2020-01-12 11:54:36.625410210 +0100
+++ gcc/gimple-streamer-in.c	2020-09-02 13:53:39.716871790 +0200
@@ -57,11 +57,7 @@  input_phi (class lto_input_block *ib, ba
       tree def = stream_read_tree (ib, data_in);
       int src_index = streamer_read_uhwi (ib);
       bitpack_d bp = streamer_read_bitpack (ib);
-      /* Do not cache a location - we do not have API to get pointer to the
-	 location in PHI statement and we may trigger reallocation.  */
-      location_t arg_loc = stream_input_location_now (&bp, data_in);
       basic_block sbb = BASIC_BLOCK_FOR_FN (fn, src_index);
-
       edge e = NULL;
       int j;
 
@@ -72,7 +68,11 @@  input_phi (class lto_input_block *ib, ba
 	    break;
 	  }
 
-      add_phi_arg (result, def, e, arg_loc);
+      add_phi_arg (result, def, e, UNKNOWN_LOCATION);
+      /* Read location and lexical block information.  */
+      location_t *arg_locp = gimple_phi_arg_location_ptr (result, e->dest_idx);
+      data_in->location_cache.input_location_and_block (arg_locp, &bp, ib,
+							data_in);
     }
 
   return result;
@@ -106,12 +106,9 @@  input_gimple_stmt (class lto_input_block
   has_hist = bp_unpack_value (&bp, 1);
   stmt->subcode = bp_unpack_var_len_unsigned (&bp);
 
-  /* Read location information.  Caching here makes no sense until streamer
-     cache can handle the following gimple_set_block.  */
-  gimple_set_location (stmt, stream_input_location_now (&bp, data_in));
-
-  /* Read lexical block reference.  */
-  gimple_set_block (stmt, stream_read_tree (ib, data_in));
+  /* Read location and lexical block information.  */
+  data_in->location_cache.input_location_and_block (gimple_location_ptr (stmt),
+						    &bp, ib, data_in);
 
   /* Read in all the operands.  */
   switch (code)
--- gcc/gimple-streamer-out.c.jj	2020-01-12 11:54:36.625410210 +0100
+++ gcc/gimple-streamer-out.c	2020-09-02 14:34:20.360123419 +0200
@@ -48,8 +48,8 @@  output_phi (struct output_block *ob, gph
       stream_write_tree (ob, gimple_phi_arg_def (phi, i), true);
       streamer_write_uhwi (ob, gimple_phi_arg_edge (phi, i)->src->index);
       bitpack_d bp = bitpack_create (ob->main_stream);
-      stream_output_location (ob, &bp, gimple_phi_arg_location (phi, i));
-      streamer_write_bitpack (&bp);
+      location_t loc = gimple_phi_arg_location (phi, i);
+      stream_output_location_and_block (ob, &bp, loc);
     }
 }
 
@@ -84,12 +84,8 @@  output_gimple_stmt (struct output_block
   bp_pack_value (&bp, hist != NULL, 1);
   bp_pack_var_len_unsigned (&bp, stmt->subcode);
 
-  /* Emit location information for the statement.  */
-  stream_output_location (ob, &bp, LOCATION_LOCUS (gimple_location (stmt)));
-  streamer_write_bitpack (&bp);
-
-  /* Emit the lexical block holding STMT.  */
-  stream_write_tree (ob, gimple_block (stmt), true);
+  /* Emit location information for the statement, including gimple_block.  */
+  stream_output_location_and_block (ob, &bp, gimple_location (stmt));
 
   /* Emit the operands.  */
   switch (gimple_code (stmt))