diff mbox series

mtrace: Fix output with PIE and ASLR [BZ #22716]

Message ID 20210722132457.1945231-1-siddhesh@sourceware.org
State New
Headers show
Series mtrace: Fix output with PIE and ASLR [BZ #22716] | expand

Commit Message

Siddhesh Poyarekar July 22, 2021, 1:24 p.m. UTC
Record only the relative address of the caller in mtrace file.  Use
LD_TRACE_PRELINKING to get the executable as well as binary vs
executable load offsets so that we may compute a base to add to the
relative address in the mtrace file.  This allows us to get a valid
address to pass to addr2line in all cases.

Co-authored-by:  John Ogness <john.ogness@linutronix.de>
---
 malloc/mtrace-impl.c |  4 ++--
 malloc/mtrace.pl     | 12 ++++--------
 2 files changed, 6 insertions(+), 10 deletions(-)

Comments

John Ogness July 22, 2021, 2:12 p.m. UTC | #1
Hi Siddhesh,

Thanks for pushing this much needed fix. Some comments from me below.

On 2021-07-22, Siddhesh Poyarekar <siddhesh@sourceware.org> wrote:
> Record only the relative address of the caller in mtrace file.  Use
> LD_TRACE_PRELINKING to get the executable as well as binary vs
> executable load offsets so that we may compute a base to add to the
> relative address in the mtrace file.  This allows us to get a valid
> address to pass to addr2line in all cases.
>
> Co-authored-by:  John Ogness <john.ogness@linutronix.de>
> ---
>  malloc/mtrace-impl.c |  4 ++--
>  malloc/mtrace.pl     | 12 ++++--------
>  2 files changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/malloc/mtrace.pl b/malloc/mtrace.pl
> index 6f49c8338d..f2570d2186 100644
> --- a/malloc/mtrace.pl
> +++ b/malloc/mtrace.pl
> @@ -75,11 +75,12 @@ if ($#ARGV == 0) {
>      } else {
>  	$prog = "./$binary";
>      }
> -    if (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")) {
> +    if (open (LOCS, "env LD_TRACE_PRELINKING=1 $prog |")) {
>  	while (<LOCS>) {
>  	    chop;
> -	    if (/^.*=> (.*) .(0x[0123456789abcdef]*).$/) {
> +	    if (/^.*=> (.*) \((0x[0123456789abcdef]*), (0x[0123456789abcdef]*).*/) {

AFAIK you are only interested in @l_map_start of the link map. That is
the first argument.

>  		$locs{$1} = $2;
> +		$rel{$1} = hex($2) - hex($3);

Subtracting @l_addr from @l_map_start will probably always result in
0. Why should @l_addr be interesting for mtrace?

I recommend:

+	    if (/^.*=> (.*) .(0x[0123456789abcdef]*),.*/) {
 		$locs{$1} = $2;
+		$rel{$1} = hex($2);

>  	    }
>  	}
>  	close (LOCS);
> @@ -110,12 +111,7 @@ sub location {
>  	my $addr = $2;
>  	my $searchaddr;
>  	return $cache{$addr} if (exists $cache{$addr});
> -	if ($locs{$prog} ne "") {
> -	    $searchaddr = sprintf "%#x", $addr - $locs{$prog};
> -	} else {
> -	    $searchaddr = $addr;
> -	    $prog = $binary;
> -	}
> +	$searchaddr = sprintf "%#x", hex($addr) + $rel{$prog};

And then $rel would need to be subtracted, not added:

+	$searchaddr = sprintf "%#x", hex($addr) - $rel{$prog};

John Ogness
Siddhesh Poyarekar July 22, 2021, 2:31 p.m. UTC | #2
On 7/22/21 7:42 PM, John Ogness wrote:
> AFAIK you are only interested in @l_map_start of the link map. That is
> the first argument.
> 
>>   		$locs{$1} = $2;
>> +		$rel{$1} = hex($2) - hex($3);
> 
> Subtracting @l_addr from @l_map_start will probably always result in
> 0. Why should @l_addr be interesting for mtrace?
> 

l_addr is the difference between the ELF file and that in memory.  For 
PIC/PIE, the base address in the ELF is set at 0, so l_addr and 
l_map_start is the same.

However for non-pie binaries, the base address in the ELF is non-zero, 
because of which you'll see the same value in l_map_start as in the ELF, 
which makes l_addr 0.

So the (l_addr - l_map_start) is 0 for PIC and PIE, but not for non-pie 
executables.  Adding that relative offset in the binary for addr2line 
will give you the correct symbol in both pic/pie and non-pie cases.

Siddhesh
John Ogness July 22, 2021, 2:45 p.m. UTC | #3
On 2021-07-22, Siddhesh Poyarekar <siddhesh@sourceware.org> wrote:
> On 7/22/21 7:42 PM, John Ogness wrote:
>> AFAIK you are only interested in @l_map_start of the link map. That is
>> the first argument.
>> 
>>>   		$locs{$1} = $2;
>>> +		$rel{$1} = hex($2) - hex($3);
>> 
>> Subtracting @l_addr from @l_map_start will probably always result in
>> 0. Why should @l_addr be interesting for mtrace?
>
> l_addr is the difference between the ELF file and that in memory.  For 
> PIC/PIE, the base address in the ELF is set at 0, so l_addr and 
> l_map_start is the same.
>
> However for non-pie binaries, the base address in the ELF is non-zero, 
> because of which you'll see the same value in l_map_start as in the ELF, 
> which makes l_addr 0.

Thank you for the clarification. I forgot about non-pie. But then it
means we are always interested in l_addr. l_map_start is irrelevant.

> So the (l_addr - l_map_start) is 0 for PIC and PIE, but not for non-pie 
> executables.  Adding that relative offset in the binary for addr2line 
> will give you the correct symbol in both pic/pie and non-pie cases.

OK, but settings MALLOC_TRACE will cause the memory addresses to be
stored. So it is l_addr that must be subtracted for addr2line.

John Ogness
Siddhesh Poyarekar July 22, 2021, 3:17 p.m. UTC | #4
On 7/22/21 8:15 PM, John Ogness wrote:
> Thank you for the clarification. I forgot about non-pie. But then it
> means we are always interested in l_addr. l_map_start is irrelevant.

Not really.  The end result we want is an address that points to the 
correct symbol in the ELF binary.  We may think of that address as the 
ELF base address and the offset of the symbol from the base because 
regardless of whether the object is relocatable or not, the offset of 
the symbol will be constant from the base.

We get the offset of the symbol from base in the MALLOC_TRACE run; I 
have modified the output to be the offset instead of the actual address 
because for relocatable objects the actual address is practically 
useless across multiple runs.

In the second run (i.e. prelinking trace), we want to compute the ELF 
base address.  We have l_map_start (which is the actual base address in 
memory) and l_addr (which is l_map_start - ELF base address).  So the 
ELF base address is clearly l_map_start - l_addr.

Finally we get the symbol address as ELF base + offset recorded in the 
trace.

>> So the (l_addr - l_map_start) is 0 for PIC and PIE, but not for non-pie
>> executables.  Adding that relative offset in the binary for addr2line
>> will give you the correct symbol in both pic/pie and non-pie cases.
> 
> OK, but settings MALLOC_TRACE will cause the memory addresses to be
> stored. So it is l_addr that must be subtracted for addr2line.

My patch updates the dump to be that of the offset from base and not the 
actual address, which is why all of this works.

Siddhesh
Siddhesh Poyarekar July 22, 2021, 3:19 p.m. UTC | #5
On 7/22/21 8:15 PM, John Ogness via Libc-alpha wrote:
> OK, but settings MALLOC_TRACE will cause the memory addresses to be
> stored. So it is l_addr that must be subtracted for addr2line.

I forgot to mention that you cannot subtract across two different runs 
like that because they will have different base addresses due to ASLR.

Siddhesh
Siddhesh Poyarekar Aug. 11, 2021, 7:41 a.m. UTC | #6
Ping!

On 7/22/21 6:54 PM, Siddhesh Poyarekar via Libc-alpha wrote:
> Record only the relative address of the caller in mtrace file.  Use
> LD_TRACE_PRELINKING to get the executable as well as binary vs
> executable load offsets so that we may compute a base to add to the
> relative address in the mtrace file.  This allows us to get a valid
> address to pass to addr2line in all cases.
> 
> Co-authored-by:  John Ogness <john.ogness@linutronix.de>
> ---
>   malloc/mtrace-impl.c |  4 ++--
>   malloc/mtrace.pl     | 12 ++++--------
>   2 files changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/malloc/mtrace-impl.c b/malloc/mtrace-impl.c
> index 83008ca18f..3a65863d24 100644
> --- a/malloc/mtrace-impl.c
> +++ b/malloc/mtrace-impl.c
> @@ -65,9 +65,9 @@ tr_where (const void *caller, Dl_info *info)
>   		       offset);
>               }
>   
> -	  fprintf (mallstream, "@ %s%s%s[%p] ", info->dli_fname ? : "",
> +	  fprintf (mallstream, "@ %s%s%s[0x%" PRIxPTR "] ", info->dli_fname ? : "",
>   		   info->dli_fname ? ":" : "",
> -                   buf, caller);
> +                   buf, caller - info->dli_fbase);
>           }
>         else
>           fprintf (mallstream, "@ [%p] ", caller);
> diff --git a/malloc/mtrace.pl b/malloc/mtrace.pl
> index 6f49c8338d..f2570d2186 100644
> --- a/malloc/mtrace.pl
> +++ b/malloc/mtrace.pl
> @@ -75,11 +75,12 @@ if ($#ARGV == 0) {
>       } else {
>   	$prog = "./$binary";
>       }
> -    if (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")) {
> +    if (open (LOCS, "env LD_TRACE_PRELINKING=1 $prog |")) {
>   	while (<LOCS>) {
>   	    chop;
> -	    if (/^.*=> (.*) .(0x[0123456789abcdef]*).$/) {
> +	    if (/^.*=> (.*) \((0x[0123456789abcdef]*), (0x[0123456789abcdef]*).*/) {
>   		$locs{$1} = $2;
> +		$rel{$1} = hex($2) - hex($3);
>   	    }
>   	}
>   	close (LOCS);
> @@ -110,12 +111,7 @@ sub location {
>   	my $addr = $2;
>   	my $searchaddr;
>   	return $cache{$addr} if (exists $cache{$addr});
> -	if ($locs{$prog} ne "") {
> -	    $searchaddr = sprintf "%#x", $addr - $locs{$prog};
> -	} else {
> -	    $searchaddr = $addr;
> -	    $prog = $binary;
> -	}
> +	$searchaddr = sprintf "%#x", hex($addr) + $rel{$prog};
>   	if ($binary ne "" && open (ADDR, "addr2line -e $prog $searchaddr|")) {
>   	    my $line = <ADDR>;
>   	    chomp $line;
>
DJ Delorie Aug. 12, 2021, 4:15 a.m. UTC | #7
Siddhesh Poyarekar via Libc-alpha <libc-alpha@sourceware.org> writes:

LGTM with one note...

> -    if (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")) {
> +    if (open (LOCS, "env LD_TRACE_PRELINKING=1 $prog |")) {

Note that LD_TRACE_PRELINKING specifies a list of objects; setting it to
"1" works as long as none of the objects are named "1".  I would think a
suitable comment would at least prevent spreading misunderstandings
about that variable.

Reviewed-by: DJ Delorie <dj@redhat.com>
Siddhesh Poyarekar Aug. 12, 2021, 4:33 a.m. UTC | #8
On 8/12/21 9:45 AM, DJ Delorie via Libc-alpha wrote:
> Siddhesh Poyarekar via Libc-alpha <libc-alpha@sourceware.org> writes:
> 
> LGTM with one note...
> 
>> -    if (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")) {
>> +    if (open (LOCS, "env LD_TRACE_PRELINKING=1 $prog |")) {
> 
> Note that LD_TRACE_PRELINKING specifies a list of objects; setting it to
> "1" works as long as none of the objects are named "1".  I would think a
> suitable comment would at least prevent spreading misunderstandings
> about that variable.

It looks like setting it to empty, i.e.

     env LD_TRACE_PRELINKING= $prog

also works just as well.  I'll send v2 with that.

Thanks,
Siddhesh
DJ Delorie Aug. 12, 2021, 4:35 a.m. UTC | #9
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:
> It looks like setting it to empty, i.e.
>
>      env LD_TRACE_PRELINKING= $prog
>
> also works just as well.  I'll send v2 with that.

That even more so begs for a comment explaining it ;-)
diff mbox series

Patch

diff --git a/malloc/mtrace-impl.c b/malloc/mtrace-impl.c
index 83008ca18f..3a65863d24 100644
--- a/malloc/mtrace-impl.c
+++ b/malloc/mtrace-impl.c
@@ -65,9 +65,9 @@  tr_where (const void *caller, Dl_info *info)
 		       offset);
             }
 
-	  fprintf (mallstream, "@ %s%s%s[%p] ", info->dli_fname ? : "",
+	  fprintf (mallstream, "@ %s%s%s[0x%" PRIxPTR "] ", info->dli_fname ? : "",
 		   info->dli_fname ? ":" : "",
-                   buf, caller);
+                   buf, caller - info->dli_fbase);
         }
       else
         fprintf (mallstream, "@ [%p] ", caller);
diff --git a/malloc/mtrace.pl b/malloc/mtrace.pl
index 6f49c8338d..f2570d2186 100644
--- a/malloc/mtrace.pl
+++ b/malloc/mtrace.pl
@@ -75,11 +75,12 @@  if ($#ARGV == 0) {
     } else {
 	$prog = "./$binary";
     }
-    if (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")) {
+    if (open (LOCS, "env LD_TRACE_PRELINKING=1 $prog |")) {
 	while (<LOCS>) {
 	    chop;
-	    if (/^.*=> (.*) .(0x[0123456789abcdef]*).$/) {
+	    if (/^.*=> (.*) \((0x[0123456789abcdef]*), (0x[0123456789abcdef]*).*/) {
 		$locs{$1} = $2;
+		$rel{$1} = hex($2) - hex($3);
 	    }
 	}
 	close (LOCS);
@@ -110,12 +111,7 @@  sub location {
 	my $addr = $2;
 	my $searchaddr;
 	return $cache{$addr} if (exists $cache{$addr});
-	if ($locs{$prog} ne "") {
-	    $searchaddr = sprintf "%#x", $addr - $locs{$prog};
-	} else {
-	    $searchaddr = $addr;
-	    $prog = $binary;
-	}
+	$searchaddr = sprintf "%#x", hex($addr) + $rel{$prog};
 	if ($binary ne "" && open (ADDR, "addr2line -e $prog $searchaddr|")) {
 	    my $line = <ADDR>;
 	    chomp $line;