gcc: read -fdebug-prefix-map OLD from environment (improved reproducibility)

Message ID 1449768978-3872-1-git-send-email-dkg@fifthhorseman.net
State New

Commit Message

Daniel Kahn Gillmor Dec. 10, 2015, 5:36 p.m. UTC
Work on the reproducible-builds project [0] has identified that build
paths are one cause of output variation between builds.  This
changeset allows users to avoid this variation when building C objects
with debug symbols, while leaving the default behavior unchanged.

Background
----------

gcc includes the build path in any generated DWARF debugging symbols,
specifically in DW_AT_comp_dir, but allows the embedded path to be
changed via -fdebug-prefix-map.
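
To make the mechanism concrete, here is a minimal standalone sketch of what a prefix map does to a recorded path. It is not GCC's implementation: the function name remap_prefix and the example paths are assumptions chosen for illustration, and it uses plain malloc rather than GCC's allocation helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not GCC's code: if FILENAME starts with
   OLD_PREFIX, return a newly allocated string in which that prefix is
   replaced by NEW_PREFIX; otherwise return a copy of FILENAME.
   Returns NULL if allocation fails.  The caller frees the result.  */
char *
remap_prefix (const char *filename, const char *old_prefix,
              const char *new_prefix)
{
  size_t old_len = strlen (old_prefix);
  size_t new_len = strlen (new_prefix);
  const char *rest;
  char *result;

  if (strncmp (filename, old_prefix, old_len) != 0)
    {
      /* No match: hand back an unchanged copy.  */
      result = malloc (strlen (filename) + 1);
      if (result)
        strcpy (result, filename);
      return result;
    }

  /* Match: splice NEW_PREFIX in front of whatever followed OLD_PREFIX.  */
  rest = filename + old_len;
  result = malloc (new_len + strlen (rest) + 1);
  if (!result)
    return NULL;
  memcpy (result, new_prefix, new_len);
  strcpy (result + new_len, rest);
  return result;
}
```

With an old prefix of /home/user/build and a new prefix of /usr/src, the path /home/user/build/foo.c would be recorded as /usr/src/foo.c, while a path outside the build directory is left alone.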

When -fdebug-prefix-map is used with the current build path, it
removes the build path from DW_AT_comp_dir, but the option itself
(build path included) is then recorded in DW_AT_producer along with
the rest of the command line, so the reproducibility problem isn't
resolved.

When building software for binary redistribution, the actual build
path on the build machine is irrelevant, and doesn't need to be
exposed in the debug symbols.

Resolution
----------

This patch extends the first argument to -fdebug-prefix-map ("old") to
be able to read from the environment, which allows a packager to avoid
embedded build paths in the debugging symbols with something like:

  export SOURCE_BUILD_DIR="$(pwd)"
  gcc -fdebug-prefix-map=\$SOURCE_BUILD_DIR=/usr/src

Details
-------

Specifically, if the first character of the "old" argument is a
literal $, then gcc will treat it as an environment variable name, and
use the value of the env var for prefix mapping.
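
The lookup described above can be sketched outside GCC roughly as follows. This is an illustration under stated assumptions, not the patch itself: resolve_old_prefix is a hypothetical name, plain malloc stands in for xstrndup, the function returns NULL where the patch would emit a warning, and it copies the environment value rather than keeping the pointer returned by getenv.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the first LEN
   bytes of S, NUL-terminated, or NULL if allocation fails.  */
static char *
copy_n (const char *s, size_t len)
{
  char *out = malloc (len + 1);

  if (!out)
    return NULL;
  memcpy (out, s, len);
  out[len] = '\0';
  return out;
}

/* Sketch of the "old" lookup.  ARG is the text that follows
   "-fdebug-prefix-map=", for example "$SOURCE_BUILD_DIR=/usr/src".
   Returns a newly allocated copy of the resolved old prefix, or NULL
   if ARG has no '=', the named variable is unset, or allocation
   fails.  The caller frees the result.  */
char *
resolve_old_prefix (const char *arg)
{
  const char *eq = strchr (arg, '=');
  const char *value;
  char *name;

  if (!eq)
    return NULL;

  /* Plain path: the old prefix is everything before the '='.  */
  if (*arg != '$')
    return copy_n (arg, (size_t) (eq - arg));

  /* "$NAME=...": extract NAME and look it up in the environment.  */
  name = copy_n (arg + 1, (size_t) (eq - (arg + 1)));
  if (!name)
    return NULL;
  value = getenv (name);
  free (name);
  if (!value)
    return NULL;

  /* Copy the value so the result does not alias the environment.  */
  return copy_n (value, strlen (value));
}
```

Copying the value is a design choice of this sketch; it keeps the stored prefix valid even if the environment is modified later.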

As a result, DW_AT_producer contains the literal envvar name,
DW_AT_comp_dir contains the transformed build path, and the actual
build path is not at all present in the generated object file.

This has been tested successfully on amd64 machines, and I see no
reason why it would be platform-specific.

More discussion of alternate approaches considered and discarded in
the development of this change can be found at [1] for those
interested.

Feedback welcome!

[0] https://reproducible-builds.org
[1] https://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20151130/004051.html
---
 gcc/doc/invoke.texi |  4 +++-
 gcc/final.c         | 27 +++++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 3 deletions(-)

Comments

Daniel Kahn Gillmor Dec. 10, 2015, 10:05 p.m. UTC | #1
On Thu 2015-12-10 12:36:18 -0500, Daniel Kahn Gillmor wrote:
> Work on the reproducible-builds project [0] has identified that build
> paths are one cause of output variation between builds.  This
> changeset allows users to avoid this variation when building C objects
> with debug symbols, while leaving the default behavior unchanged.

I've now opened a bugzilla issue about this as well, with patch
attached:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68848

   --dkg

Joseph Myers Dec. 10, 2015, 11:59 p.m. UTC | #2
On Thu, 10 Dec 2015, Daniel Kahn Gillmor wrote:

> Specifically, if the first character of the "old" argument is a
> literal $, then gcc will treat it as an environment variable name, and
> use the value of the env var for prefix mapping.

I don't think a literal $ in option arguments is a good idea; it's far too 
hard to pass through a sequence of shells and makefiles that you typically 
get in recursive make.  You end up with things like 
'-Wl,-rpath,'\''\\\$$\$$\\\$$\$$ORIGIN'\''/../' (part of a process for 
using $ORIGIN when linking GDB) if you try.

Daniel Kahn Gillmor Dec. 11, 2015, 12:12 a.m. UTC | #3
On Thu 2015-12-10 18:59:33 -0500, Joseph Myers wrote:
> On Thu, 10 Dec 2015, Daniel Kahn Gillmor wrote:
>
>> Specifically, if the first character of the "old" argument is a
>> literal $, then gcc will treat it as an environment variable name, and
>> use the value of the env var for prefix mapping.
>
> I don't think a literal $ in option arguments is a good idea; it's far too 
> hard to pass through a sequence of shells and makefiles that you typically 
> get in recursive make.  You end up with things like 
> '-Wl,-rpath,'\''\\\$$\$$\\\$$\$$ORIGIN'\''/../' (part of a process for 
> using $ORIGIN when linking GDB) if you try.

yow, that's truly monstrous!

Is there a different symbol or string you'd be OK using instead for the
same approach?  What about looking for an "ENV:" prefix?

so something like:

 -fdebug-prefix-map=ENV:SOURCE_BUILD_DIR=/usr/src

wdyt?  I could rework the patch pretty easily if that seems acceptable.
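
As a rough idea of what recognizing that form might involve, here is a standalone sketch. Nothing here is from a reworked patch: env_name_from_arg and the ENV_PREFIX macro are hypothetical names, and treating an empty variable name as invalid is an assumption.

```c
#include <stdlib.h>
#include <string.h>

#define ENV_PREFIX "ENV:"

/* Hypothetical parser for the proposed "ENV:NAME=new" form.  ARG is
   the text that follows "-fdebug-prefix-map=".  Returns a newly
   allocated copy of NAME, or NULL if ARG does not use the ENV: form,
   has no '=', names an empty variable, or allocation fails.  The
   caller frees the result.  */
char *
env_name_from_arg (const char *arg)
{
  const size_t prefix_len = strlen (ENV_PREFIX);
  const char *eq = strchr (arg, '=');
  const char *start;
  size_t len;
  char *name;

  if (!eq || strncmp (arg, ENV_PREFIX, prefix_len) != 0)
    return NULL;

  /* The variable name sits between the prefix and the '='.  */
  start = arg + prefix_len;
  if (eq <= start)
    return NULL;

  len = (size_t) (eq - start);
  name = malloc (len + 1);
  if (!name)
    return NULL;
  memcpy (name, start, len);
  name[len] = '\0';
  return name;
}
```

Because the marker is ordinary text, the argument needs no shell or make escaping, which is the point of moving away from a literal $.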

        --dkg

Thomas Klausner Dec. 22, 2015, 12:15 p.m. UTC | #4
I just found a related patch again that we have in NetBSD's gcc that
allows fixing __FILE__ references.

See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47047

No progress since 2010 in getting this included though :(
 Thomas

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5256031..234432f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6440,7 +6440,9 @@  link processing time.  Merging is enabled by default.
 @item -fdebug-prefix-map=@var{old}=@var{new}
 @opindex fdebug-prefix-map
 When compiling files in directory @file{@var{old}}, record debugging
-information describing them as in @file{@var{new}} instead.
+information describing them as in @file{@var{new}} instead.  If
+@file{@var{old}} starts with a @samp{$}, the corresponding environment
+variable will be dereferenced, and its value will be used instead.
 
 @item -fno-dwarf2-cfi-asm
 @opindex fdwarf2-cfi-asm
diff --git a/gcc/final.c b/gcc/final.c
index 8cb5533..bc43b61 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1525,6 +1525,9 @@  add_debug_prefix_map (const char *arg)
 {
   debug_prefix_map *map;
   const char *p;
+  char *env;
+  const char *old;
+  size_t oldlen;
 
   p = strchr (arg, '=');
   if (!p)
@@ -1532,9 +1535,29 @@  add_debug_prefix_map (const char *arg)
       error ("invalid argument %qs to -fdebug-prefix-map", arg);
       return;
     }
+  if (*arg == '$')
+    {
+      env = xstrndup (arg+1, p - (arg+1));
+      old = getenv(env);
+      if (!old)
+	{
+	  warning (0, "environment variable %qs not set in argument to "
+		   "-fdebug-prefix-map", env);
+	  free(env);
+	  return;
+	}
+      oldlen = strlen(old);
+      free(env);
+    }
+  else
+    {
+      old = xstrndup (arg, p - arg);
+      oldlen = p - arg;
+    }
+
   map = XNEW (debug_prefix_map);
-  map->old_prefix = xstrndup (arg, p - arg);
-  map->old_len = p - arg;
+  map->old_prefix = old;
+  map->old_len = oldlen;
   p++;
   map->new_prefix = xstrdup (p);
   map->new_len = strlen (p);