diff mbox

[3/3] When remapping paths, only match whole path components

Message ID 20170411113446.15683-4-infinity0@pwned.gg
State New
Headers show

Commit Message

Ximin Luo April 11, 2017, 11:34 a.m. UTC
Change the remapping algorithm so that each old_prefix only matches paths that
have old_prefix as a whole path component prefix.  (A whole path component is a
part of a path that begins and ends at a directory separator or at either end
of the path string.)

This remapping algorithm is more predictable than the old algorithm, because
there is no chance of mappings for one directory interfering with mappings for
other directories.  It contains less corner cases and is therefore nicer for
clients to use.  For these reasons, in our BUILD_PATH_PREFIX_MAP specification
we recommend this algorithm, and it would be good for GCC to follow suit.

This does technically break backwards compatibility but I don't think anyone
would be reasonably depending on the corner cases of the previous algorithm,
which are surprising and counterintuitive.

Acknowledgements
----------------

Discussions with Michael Woerister and other members of the Rust compiler team
on Github, and discussions with Daniel Shahaf on the rb-general@ mailing list
on lists.reproducible-builds.org.

ChangeLogs
----------

gcc/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* doc/invoke.texi (Environment Variables): Document form and behaviour
	of BUILD_PATH_PREFIX_MAP.

libiberty/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* prefix-map.c: When remapping paths, only match whole path components.
diff mbox

Patch

Index: gcc-7-20170409/gcc/doc/invoke.texi
===================================================================
--- gcc-7-20170409.orig/gcc/doc/invoke.texi
+++ gcc-7-20170409/gcc/doc/invoke.texi
@@ -26637,6 +26637,26 @@  Recognize EUCJP characters.
 If @env{LANG} is not defined, or if it has some other value, then the
 compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale to
 recognize and translate multibyte characters.
+
+@item BUILD_PATH_PREFIX_MAP
+@findex BUILD_PATH_PREFIX_MAP
+If this variable is set, it specifies an ordered map used to transform
+filepaths output in debugging symbols and expansions of the @code{__FILE__}
+macro.  This may be used to achieve fully reproducible output.  In the context
+of running GCC within a higher-level build tool, it is typically more reliable
+than setting command line arguments such as @option{-fdebug-prefix-map} or
+common environment variables such as @env{CFLAGS}, since the build tool may
+save these latter values into other output outside of GCC's control.
+
+The value is of the form
+@samp{@var{dst@r{[0]}}=@var{src@r{[0]}}:@var{dst@r{[1]}}=@var{src@r{[1]}}@r{@dots{}}}.
+If any @var{dst@r{[}i@r{]}} or @var{src@r{[}i@r{]}} contains @code{%}, @code{=}
+or @code{:} characters, they must be replaced with @code{%#}, @code{%+}, and
+@code{%.} respectively.
+
+Whenever GCC emits a filepath that starts with a whole path component matching
+@var{src@r{[}i@r{]}} for some @var{i}, with rightmost @var{i} taking priority,
+the matching part is replaced with @var{dst@r{[}i@r{]}} in the final output.
 @end table
 
 @noindent
Index: gcc-7-20170409/libiberty/prefix-map.c
===================================================================
--- gcc-7-20170409.orig/libiberty/prefix-map.c
+++ gcc-7-20170409/libiberty/prefix-map.c
@@ -87,12 +87,22 @@  struct prefix_map *
 prefix_map_find (struct prefix_map *map, const char *old_name,
 		 const char **suffix, size_t *suf_len)
 {
+  size_t len;
+
   for (; map; map = map->next)
-    if (filename_ncmp (old_name, map->old_prefix, map->old_len) == 0)
-      {
-	*suf_len = strlen (*suffix = old_name + map->old_len);
-	break;
-      }
+    {
+      len = map->old_len;
+      /* Ignore trailing path separators at the end of old_prefix */
+      while (len > 0 && IS_DIR_SEPARATOR (map->old_prefix[len-1])) len--;
+      /* Check if old_name matches old_prefix at a path component boundary */
+      if (! filename_ncmp (old_name, map->old_prefix, len)
+	  && (IS_DIR_SEPARATOR (old_name[len])
+	      || old_name[len] == '\0'))
+	{
+	  *suf_len = strlen (*suffix = old_name + len);
+	  break;
+	}
+    }
 
   return map;
 }