Patchwork elf: Calculate symbol size if needed

Submitter Stefan Weil
Date Aug. 9, 2010, 2:43 p.m.
Message ID <1281365033-6893-1-git-send-email-weil@mail.berlios.de>
Permalink /patch/61282/
State New

Comments

Stefan Weil - Aug. 9, 2010, 2:43 p.m.
Symbols with a size of 0 are unusable for the disassembler.

Example:

While running an arm linux kernel, no symbolic names are
used in qemu.log when the cpu is executing an assembler function.

Assume that the size of such symbols is the difference to the
next symbol value.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
---
 hw/elf_ops.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)
Blue Swirl - Aug. 11, 2010, 4:21 p.m.
On Mon, Aug 9, 2010 at 2:43 PM, Stefan Weil <weil@mail.berlios.de> wrote:
> Symbols with a size of 0 are unusable for the disassembler.
>
> Example:
>
> While running an arm linux kernel, no symbolic names are
> used in qemu.log when the cpu is executing an assembler function.

That is a problem of the assembler function, it should use '.size'
directive like what happens when C code is compiled. And why just ARM?

> Assume that the size of such symbols is the difference to the
> next symbol value.
>
> Signed-off-by: Stefan Weil <weil@mail.berlios.de>
> ---
>  hw/elf_ops.h |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
> index 27d1ab9..0bd7235 100644
> --- a/hw/elf_ops.h
> +++ b/hw/elf_ops.h
> @@ -153,6 +153,11 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>         syms = qemu_realloc(syms, nsyms * sizeof(*syms));
>
>         qsort(syms, nsyms, sizeof(*syms), glue(symcmp, SZ));
> +        for (i = 0; i < nsyms - 1; i++) {
> +            if (syms[i].st_size == 0) {
> +                syms[i].st_size = syms[i + 1].st_value - syms[i].st_value;
> +            }
> +        }

The size of the last symbol is not guesstimated, it could be assumed
to be _etext - syms[nsyms].st_value.

>     } else {
>         qemu_free(syms);
>         syms = NULL;
Stefan Weil - Aug. 11, 2010, 6:03 p.m.
On 11.08.2010 18:21, Blue Swirl wrote:
> On Mon, Aug 9, 2010 at 2:43 PM, Stefan Weil<weil@mail.berlios.de>  wrote:
>    
>> Symbols with a size of 0 are unusable for the disassembler.
>>
>> Example:
>>
>> While running an arm linux kernel, no symbolic names are
>> used in qemu.log when the cpu is executing an assembler function.
>>      
> That is a problem of the assembler function, it should use '.size'
> directive like what happens when C code is compiled. And why just ARM?
>    

It's not just ARM. ARM is just an example.

But I stumbled upon this problem when running the linux
start code from arch/arm/kernel/head.S.

> The size of the last symbol is not guesstimated, it could be assumed
> to be _etext - syms[nsyms].st_value.
>    

Or rather:
syms[nsyms - 1].st_size = _etext - syms[nsyms - 1].st_value

Even that would be wrong if the last symbol is not in the
text segment but in the data segment.

Programming that special case just to get perhaps one
last symbol size seems like too much perfectionism.

Most symbols have a size != 0, so let's hope the last symbol
has one, too :-)
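
The corrected formula and its caveat can be sketched as follows. This is only an illustration under assumptions: `demo_sym`, `demo_fix_last` and the `text_end` parameter (standing in for the address of `_etext`) are hypothetical names, and the table is assumed to be sorted by address already. The guard skips the fixup when the last symbol lies at or beyond `text_end`, which is the data-segment case mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF symbol table entry. */
struct demo_sym {
    uint64_t st_value;   /* start address of the symbol */
    uint64_t st_size;    /* 0 means the size was not recorded */
};

/*
 * Guess the size of the last symbol from an end-of-text address.
 * Returns 1 if a size was filled in, 0 if the symbol was left alone.
 * Assumes syms[] is already sorted by ascending st_value.
 */
static int demo_fix_last(struct demo_sym *syms, size_t nsyms,
                         uint64_t text_end)
{
    struct demo_sym *last;

    assert(syms != NULL || nsyms == 0);
    if (nsyms == 0) {
        return 0;
    }
    last = &syms[nsyms - 1];     /* index nsyms - 1, not nsyms */
    if (last->st_size != 0 || last->st_value >= text_end) {
        /* Size already known, or symbol is not inside the text range. */
        return 0;
    }
    last->st_size = text_end - last->st_value;
    return 1;
}
```

Without the range check, a last symbol located after the text segment would get a wrapped-around, meaninglessly large size from the unsigned subtraction.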
Stefan Weil - Sept. 9, 2010, 5:42 p.m.
On 11.08.2010 18:21, Blue Swirl wrote:
> The size of the last symbol is not guesstimated, it could be assumed
> to be _etext - syms[nsyms].st_value.

The patch is still missing from qemu master.
From the two replies I did not gather that anything needs to be changed.
Was I wrong, or can it be applied?
Blue Swirl - Sept. 9, 2010, 6:44 p.m.
On Thu, Sep 9, 2010 at 5:42 PM, Stefan Weil <weil@mail.berlios.de> wrote:
> The patch is still missing in qemu master.
> From the two feedbacks I did not read that anything needs to be changed.
> Was I wrong, or can it be applied?

Please fix the last symbol. Either we should fix all symbols or none,
half fixed (OK, practically all) is not so great.
Stefan Weil - Sept. 9, 2010, 7:11 p.m.
On 09.09.2010 20:44, Blue Swirl wrote:
> Please fix the last symbol. Either we should fix all symbols or none,
> half fixed (OK, practically all) is not so great.


The last symbol is one of several thousand, and most symbols don't need
a fix, so with my fix more than 99.9 or even 99.99 percent of all symbols
are ok :-)
If the last symbol happens to be wrong, there is still a high probability
that nobody will notice, because it is unused by QEMU. The problem I faced
with QEMU's disassembly came from symbols with an address followed by code.
Is there any code after the last symbol? I don't expect that. In a sorted
list of symbols from the text segment, _etext should be the last symbol!

I think the small chance of a missing fix for the last symbol is out of
proportion to the code needed.

Even worse, I have no simple formula to guess a valid value for the last
symbol. The formula you suggested (with the corrections I wrote in my
reply) is only ok if the last symbol is in the text segment. Usually there
are also symbols for data in other segments, and in many cases these
segments are located after the text segment. In these cases the last
symbol is not located in the text segment, which makes guessing its size
much more complicated.

To make it short: I don't know how to fix the last symbol in a
reasonable way.

Sorry,
Stefan
Blue Swirl - Sept. 9, 2010, 7:29 p.m.
On Thu, Sep 9, 2010 at 7:11 PM, Stefan Weil <weil@mail.berlios.de> wrote:
> Even worse, I have no simple formula to guess a valid value for the last
> symbol. The formula you suggested (with the corrections I wrote in my
> reply) is only ok if the last symbol is in the text segment. Usually
> there are also symbols for data in other segments, and in many cases
> these segments are located after the text segment. In these cases the
> last symbol is not located in the text segment which makes guesses of
> its size much more complicated.

How about using _end then?
Stefan Weil - Sept. 9, 2010, 7:34 p.m.
On 09.09.2010 21:29, Blue Swirl wrote:
> How about using _end then?
>
>    

Wouldn't _end be the last symbol then?
Blue Swirl - Sept. 9, 2010, 7:36 p.m.
On Thu, Sep 9, 2010 at 7:34 PM, Stefan Weil <weil@mail.berlios.de> wrote:
>> How about using _end then?
>>
>>
>
> Wouldn't _end be the last symbol then?

Right, _end should be the last one in any case. I'll apply the patch.
Edgar Iglesias - Sept. 9, 2010, 9:07 p.m.
On Thu, Sep 09, 2010 at 07:36:28PM +0000, Blue Swirl wrote:
> > Wouldn't _end be the last symbol then?
> 
> Right, _end should be the last one in any case. I'll apply the patch.

I'm not so sure that is the case. The load_symbols call throws away
symbols that are not typed as functions. The filtering is done
prior to the suggested size fixups so my guess is that _end is typically
gone when the suggested size fixup is done.

I'm not opposed to the patch though...

Cheers
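
The ordering concern can be illustrated with a small sketch. All names here (`demo_sym`, `DEMO_FUNC`, `DEMO_NOTYPE`, `demo_keep_funcs`, `demo_fix_sizes`) are hypothetical, and the type field is a simplification of the ELF symbol type; this is not QEMU's load_symbols. It assumes, as described above, that non-function symbols are dropped before the size fixup runs, and that the input is already sorted by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol type and symbol entry. */
enum demo_type { DEMO_NOTYPE, DEMO_FUNC };

struct demo_sym {
    const char *name;
    enum demo_type type;
    uint64_t st_value;   /* start address of the symbol */
    uint64_t st_size;    /* 0 means the size was not recorded */
};

/* Compact the table in place, keeping only function symbols. */
static size_t demo_keep_funcs(struct demo_sym *syms, size_t nsyms)
{
    size_t in, out = 0;

    assert(syms != NULL || nsyms == 0);
    for (in = 0; in < nsyms; in++) {
        if (syms[in].type == DEMO_FUNC) {
            syms[out++] = syms[in];
        }
    }
    return out;          /* new number of symbols */
}

/* Size fixup as in the patch; the last remaining symbol is not touched. */
static void demo_fix_sizes(struct demo_sym *syms, size_t nsyms)
{
    size_t i;

    for (i = 0; i + 1 < nsyms; i++) {
        if (syms[i].st_size == 0) {
            syms[i].st_size = syms[i + 1].st_value - syms[i].st_value;
        }
    }
}
```

If an untyped `_end` marker is filtered out first, the last function symbol has no successor left in the table, so a recorded size of 0 stays 0 after the fixup.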

Patch

diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 27d1ab9..0bd7235 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -153,6 +153,11 @@  static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
         syms = qemu_realloc(syms, nsyms * sizeof(*syms));
 
         qsort(syms, nsyms, sizeof(*syms), glue(symcmp, SZ));
+        for (i = 0; i < nsyms - 1; i++) {
+            if (syms[i].st_size == 0) {
+                syms[i].st_size = syms[i + 1].st_value - syms[i].st_value;
+            }
+        }
     } else {
         qemu_free(syms);
         syms = NULL;
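
The loop in the patch can be tried outside QEMU with a small stand-alone sketch. Everything here is an assumption for illustration: `struct demo_sym`, `demo_symcmp` and `demo_fix_sizes` are hypothetical names, and the struct is a simplified stand-in for an ELF symbol entry, not QEMU's type. Only the field names `st_value` and `st_size` mirror the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an ELF symbol table entry. */
struct demo_sym {
    const char *name;
    uint64_t st_value;   /* start address of the symbol */
    uint64_t st_size;    /* 0 means the size was not recorded */
};

/* Order symbols by ascending address (assumed sort order for the fixup). */
static int demo_symcmp(const void *a, const void *b)
{
    const struct demo_sym *sa = a;
    const struct demo_sym *sb = b;

    return (sa->st_value > sb->st_value) - (sa->st_value < sb->st_value);
}

/*
 * Sort the table, then give every zero-sized symbol except the last one
 * the distance to the next symbol as its size.
 */
static void demo_fix_sizes(struct demo_sym *syms, size_t nsyms)
{
    size_t i;

    assert(syms != NULL || nsyms == 0);
    if (nsyms == 0) {
        return;
    }
    qsort(syms, nsyms, sizeof(*syms), demo_symcmp);
    /* "i + 1 < nsyms" avoids unsigned underflow of "nsyms - 1". */
    for (i = 0; i + 1 < nsyms; i++) {
        if (syms[i].st_size == 0) {
            syms[i].st_size = syms[i + 1].st_value - syms[i].st_value;
        }
    }
}
```

Under this rule the earlier of two symbols sharing one address still ends up with size 0, since the difference to the next value is 0.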