[05/11,v10] Add API to get memory mapping

Message ID	4F67FEB6.8060705@cn.fujitsu.com
State	New

Wen Congyang March 20, 2012, 3:51 a.m. UTC

Add an API to get all virtual address to physical address mappings.
If the guest doesn't use paging, the virtual address is equal to the physical
address. The virtual/physical address mapping is meant for gdb users, and
it does not include memory that is not referenced by the page table. So if
you want to use crash to analyze the vmcore, please do not specify the -p option.
The -p option is deliberately not the default: a guest machine in a
catastrophic state can have corrupted memory, which we cannot trust.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
 memory_mapping.h |   15 +++++++++++++++
 2 files changed, 49 insertions(+), 0 deletions(-)
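
The non-paging case described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: ToyRamBlock, ToyMapping and toy_identity_mappings() are hypothetical stand-ins for QEMU's RAMBlock, MemoryMapping and the loop over ram_list.blocks, and they only show the idea that each RAM block yields one mapping whose virtual address equals its physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's RAMBlock and MemoryMapping types. */
typedef struct {
    uint64_t offset;  /* start of the block in guest physical memory */
    uint64_t length;  /* size of the block in bytes */
} ToyRamBlock;

typedef struct {
    uint64_t phys_addr;
    uint64_t virt_addr;
    uint64_t length;
} ToyMapping;

/*
 * Fill 'out' with one identity mapping (virt == phys) per RAM block.
 * Returns the number of mappings written, or -1 if 'out' is too small.
 */
static int toy_identity_mappings(const ToyRamBlock *blocks, size_t nblocks,
                                 ToyMapping *out, size_t out_cap)
{
    assert(blocks != NULL && out != NULL);
    if (nblocks > out_cap) {
        return -1;
    }
    for (size_t i = 0; i < nblocks; i++) {
        out[i].phys_addr = blocks[i].offset;
        out[i].virt_addr = blocks[i].offset;  /* no paging: virt == phys */
        out[i].length = blocks[i].length;
    }
    return (int)nblocks;
}
```

With paging enabled, the real function instead asks each CPU for its page-table-derived mappings, so this sketch covers only the fallback path.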

Hatayama, Daisuke March 23, 2012, 12:02 p.m. UTC | #1

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: [PATCH 05/11 v10] Add API to get memory mapping
Date: Tue, 20 Mar 2012 11:51:18 +0800

> Add API to get all virtual address and physical address mapping.
> If the guest doesn't use paging, the virtual address is equal to the phyical
> address. The virtual address and physical address mapping is for gdb's user, and
> it does not include the memory that is not referenced by the page table. So if
> you want to use crash to anaylze the vmcore, please do not specify -p option.
> the reason why the -p option is not default explicitly: guest machine in a
> catastrophic state can have corrupted memory, which we cannot trust.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>  memory_mapping.h |   15 +++++++++++++++
>  2 files changed, 49 insertions(+), 0 deletions(-)
> 
> diff --git a/memory_mapping.c b/memory_mapping.c
> index 718f271..b92e2f6 100644
> --- a/memory_mapping.c
> +++ b/memory_mapping.c
> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>      list->last_mapping = NULL;
>      QTAILQ_INIT(&list->head);
>  }
> +
> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
> +{
> +    CPUArchState *env;
> +    RAMBlock *block;
> +    ram_addr_t offset, length;
> +    int ret;
> +    bool paging_mode;
> +
> +    paging_mode = cpu_paging_enabled(first_cpu);
> +    if (paging_mode) {

On SMP with n CPUs, we can do this check up to n times.

On Linux, user-mode tasks have different page tables. If we refer to
only one page table, we get the memory of only one user-mode task. To
cover as much memory as possible, it's best to look at all CPUs with
paging enabled and walk all of their page tables.

A problem is that linear addresses of user-mode tasks can inherently
conflict. Different user-mode tasks can have the same linear
address. So tools need to distinguish each PT_LOAD entry by the
pair of linear address and physical address, not by linear address
alone. I don't know whether gdb does this.
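
To make the conflict concrete, here is a small sketch under stated assumptions. ToySegment is a hypothetical, simplified view of a PT_LOAD entry (it is not the ELF program header layout), and the two lookup helpers are illustrative only. A lookup keyed on the linear address alone can only ever return the first matching segment, while a lookup keyed on the (linear, physical) pair can tell the two tasks apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PT_LOAD entry. */
typedef struct {
    uint64_t vaddr;  /* linear (virtual) start address */
    uint64_t paddr;  /* physical start address */
    uint64_t size;   /* length in bytes */
} ToySegment;

static bool toy_contains(const ToySegment *s, uint64_t vaddr)
{
    return vaddr >= s->vaddr && vaddr - s->vaddr < s->size;
}

/* Lookup by linear address alone: ambiguous, returns the first match. */
static const ToySegment *toy_find_by_vaddr(const ToySegment *segs, size_t n,
                                           uint64_t vaddr)
{
    assert(segs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (toy_contains(&segs[i], vaddr)) {
            return &segs[i];
        }
    }
    return NULL;
}

/* Lookup by (linear, physical) pair: the offset must match in both spaces. */
static const ToySegment *toy_find_by_pair(const ToySegment *segs, size_t n,
                                          uint64_t vaddr, uint64_t paddr)
{
    assert(segs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (toy_contains(&segs[i], vaddr) &&
            paddr == segs[i].paddr + (vaddr - segs[i].vaddr)) {
            return &segs[i];
        }
    }
    return NULL;
}
```

If two segments both start at linear address 0x400000 but map to different physical ranges, toy_find_by_vaddr() always picks the first one, whereas toy_find_by_pair() selects the segment whose physical range matches.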

> +        for (env = first_cpu; env != NULL; env = env->next_cpu) {
> +            ret = cpu_get_memory_mapping(list, env);
> +            if (ret < 0) {
> +                return -1;
> +            }
> +        }
> +        return 0;
> +    }
> +
> +    /*
> +     * If the guest doesn't use paging, the virtual address is equal to physical
> +     * address.
> +     */

IIRC, the ACPI sleep path goes through real mode. There might be other
paths that enter real mode. If execution takes this path in such a
situation, linear addresses are meaningless. But this is a really rare case.

> +    QLIST_FOREACH(block, &ram_list.blocks, next) {
> +        offset = block->offset;
> +        length = block->length;
> +        create_new_memory_mapping(list, offset, offset, length);
> +    }
> +
> +    return 0;
> +}

Thanks.
HATAYAMA, Daisuke

Wen Congyang March 26, 2012, 1:10 a.m. UTC | #2

At 03/23/2012 08:02 PM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: [PATCH 05/11 v10] Add API to get memory mapping
> Date: Tue, 20 Mar 2012 11:51:18 +0800
> 
>> Add API to get all virtual address and physical address mapping.
>> If the guest doesn't use paging, the virtual address is equal to the phyical
>> address. The virtual address and physical address mapping is for gdb's user, and
>> it does not include the memory that is not referenced by the page table. So if
>> you want to use crash to anaylze the vmcore, please do not specify -p option.
>> the reason why the -p option is not default explicitly: guest machine in a
>> catastrophic state can have corrupted memory, which we cannot trust.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>>  memory_mapping.h |   15 +++++++++++++++
>>  2 files changed, 49 insertions(+), 0 deletions(-)
>>
>> diff --git a/memory_mapping.c b/memory_mapping.c
>> index 718f271..b92e2f6 100644
>> --- a/memory_mapping.c
>> +++ b/memory_mapping.c
>> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>>      list->last_mapping = NULL;
>>      QTAILQ_INIT(&list->head);
>>  }
>> +
>> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
>> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
>> +{
>> +    CPUArchState *env;
>> +    RAMBlock *block;
>> +    ram_addr_t offset, length;
>> +    int ret;
>> +    bool paging_mode;
>> +
>> +    paging_mode = cpu_paging_enabled(first_cpu);
>> +    if (paging_mode) {
> 
> On SMP with (n)-CPUs, we can do this check at most (n)-times.
> 
> On Linux, user-mode tasks have differnet page tables. If refering to
> one page table, we can get one user-mode task memory only. Considering
> as much memory as possible, it's best to reference all CPUs with
> paging enabled and walk all the page tables.
> 
> A problem is that linear addresses for user-mode tasks can inherently
> conflicts. Different user-mode tasks can have the same linear
> address. So, tools need to distinguish each PT_LOAD entry based on a
> pair of linear address and physical address, not linear address
> only. I don't know whether gdb does this.

gdb can only process kernel space. Jan's gdb-python script may be able to
process user-mode tasks, but we would have to get the user-mode task's registers
from the kernel or the note, and convert virtual/linear addresses to physical
addresses.

> 
>> +        for (env = first_cpu; env != NULL; env = env->next_cpu) {
>> +            ret = cpu_get_memory_mapping(list, env);
>> +            if (ret < 0) {
>> +                return -1;
>> +            }
>> +        }
>> +        return 0;
>> +    }
>> +
>> +    /*
>> +     * If the guest doesn't use paging, the virtual address is equal to physical
>> +     * address.
>> +     */
> 
> IIRC, ACPI sleep state goes in real-mode. There might be another that
> can go in real-mode. If execution enters this path in such situation,
> linear addresses are meaningless. But this is really rare case.

I have not met such a case, and I do not know what I should do about it in this
patch. So I will not change it for now.

Thanks
Wen Congyang

> 
>> +    QLIST_FOREACH(block, &ram_list.blocks, next) {
>> +        offset = block->offset;
>> +        length = block->length;
>> +        create_new_memory_mapping(list, offset, offset, length);
>> +    }
>> +
>> +    return 0;
>> +}

Hatayama, Daisuke March 26, 2012, 2:31 a.m. UTC | #3

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
Date: Mon, 26 Mar 2012 09:10:52 +0800

> At 03/23/2012 08:02 PM, HATAYAMA Daisuke Wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: [PATCH 05/11 v10] Add API to get memory mapping
>> Date: Tue, 20 Mar 2012 11:51:18 +0800
>> 
>>> Add API to get all virtual address and physical address mapping.
>>> If the guest doesn't use paging, the virtual address is equal to the phyical
>>> address. The virtual address and physical address mapping is for gdb's user, and
>>> it does not include the memory that is not referenced by the page table. So if
>>> you want to use crash to anaylze the vmcore, please do not specify -p option.
>>> the reason why the -p option is not default explicitly: guest machine in a
>>> catastrophic state can have corrupted memory, which we cannot trust.
>>>
>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>> ---
>>>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>>>  memory_mapping.h |   15 +++++++++++++++
>>>  2 files changed, 49 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/memory_mapping.c b/memory_mapping.c
>>> index 718f271..b92e2f6 100644
>>> --- a/memory_mapping.c
>>> +++ b/memory_mapping.c
>>> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>>>      list->last_mapping = NULL;
>>>      QTAILQ_INIT(&list->head);
>>>  }
>>> +
>>> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
>>> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
>>> +{
>>> +    CPUArchState *env;
>>> +    RAMBlock *block;
>>> +    ram_addr_t offset, length;
>>> +    int ret;
>>> +    bool paging_mode;
>>> +
>>> +    paging_mode = cpu_paging_enabled(first_cpu);
>>> +    if (paging_mode) {
>> 
>> On SMP with (n)-CPUs, we can do this check at most (n)-times.
>> 
>> On Linux, user-mode tasks have differnet page tables. If refering to
>> one page table, we can get one user-mode task memory only. Considering
>> as much memory as possible, it's best to reference all CPUs with
>> paging enabled and walk all the page tables.
>> 
>> A problem is that linear addresses for user-mode tasks can inherently
>> conflicts. Different user-mode tasks can have the same linear
>> address. So, tools need to distinguish each PT_LOAD entry based on a
>> pair of linear address and physical address, not linear address
>> only. I don't know whether gdb does this.
> 
> gdb only can process kernel space. Jan's gdb-python script may can process
> user-mode tasks, but we should get user-mode task's register from the kernel
> or note, and convest virtual address/linear address to physicall address.
> 

After I sent this, I realized there is a problem of page table coherency:
some page tables have not been updated yet, so we would see stale ones. If we
use all the page tables referenced by all CPUs, we face inconsistency between
some of them. Essentially, we cannot avoid the possibility that the page
table we see is older than the actual state, even if we use only one
page table. But if we restrict ourselves to just one page table, we
can at least avoid inconsistency between multiple page tables. In other
words, we can do the paging processing normally, though the table might be
old.

So, I think
- using the page tables of all the CPUs at the same time is problematic.
- using only one page table from the existing CPUs is still safe.

How about code like this?

  cpu = find_cpu_paging_enabled(env);
  if (cpu) {
     /* paging processing based on the page table of the found cpu */
  }

Note that I of course consider these on the assumption that there's no
data corruption on the guest.
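
The helper named in the pseudocode above is not defined anywhere in this thread. A minimal sketch of what it might look like follows, assuming a singly linked CPU list like the one the patch walks via next_cpu. ToyCpu and toy_find_cpu_paging_enabled() are hypothetical names, and the paging_enabled field stands in for whatever cpu_paging_enabled() would report for a real CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state; only the fields needed here. */
typedef struct ToyCpu {
    int index;
    bool paging_enabled;      /* what cpu_paging_enabled() would report */
    struct ToyCpu *next_cpu;  /* singly linked CPU list, as in the patch */
} ToyCpu;

/*
 * Walk the CPU list from 'start' and return the first CPU that has
 * paging enabled, or NULL if none does.
 */
static ToyCpu *toy_find_cpu_paging_enabled(ToyCpu *start)
{
    for (ToyCpu *cpu = start; cpu != NULL; cpu = cpu->next_cpu) {
        if (cpu->paging_enabled) {
            return cpu;
        }
    }
    return NULL;
}
```

A caller would walk only the returned CPU's page table, and fall back to the identity mapping when the result is NULL. Picking the first match is an arbitrary choice in this sketch.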

>> 
>>> +        for (env = first_cpu; env != NULL; env = env->next_cpu) {
>>> +            ret = cpu_get_memory_mapping(list, env);
>>> +            if (ret < 0) {
>>> +                return -1;
>>> +            }
>>> +        }
>>> +        return 0;
>>> +    }
>>> +
>>> +    /*
>>> +     * If the guest doesn't use paging, the virtual address is equal to physical
>>> +     * address.
>>> +     */
>> 
>> IIRC, ACPI sleep state goes in real-mode. There might be another that
>> can go in real-mode. If execution enters this path in such situation,
>> linear addresses are meaningless. But this is really rare case.
> 
> I donot meet such case, and I donot know what should I do in this patch now.
> So I donot change it now.
> 

Yes, we cannot detect this because we are outside the guest kernel,
unless the guest tells us. But a note about this should be written
somewhere, as one case in which paging does not work well.

Thanks.
HATAYAMA, Daisuke

Wen Congyang March 26, 2012, 2:44 a.m. UTC | #4

At 03/26/2012 10:31 AM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
> Date: Mon, 26 Mar 2012 09:10:52 +0800
> 
>> At 03/23/2012 08:02 PM, HATAYAMA Daisuke Wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: [PATCH 05/11 v10] Add API to get memory mapping
>>> Date: Tue, 20 Mar 2012 11:51:18 +0800
>>>
>>>> Add API to get all virtual address and physical address mapping.
>>>> If the guest doesn't use paging, the virtual address is equal to the phyical
>>>> address. The virtual address and physical address mapping is for gdb's user, and
>>>> it does not include the memory that is not referenced by the page table. So if
>>>> you want to use crash to anaylze the vmcore, please do not specify -p option.
>>>> the reason why the -p option is not default explicitly: guest machine in a
>>>> catastrophic state can have corrupted memory, which we cannot trust.
>>>>
>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>> ---
>>>>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>>>>  memory_mapping.h |   15 +++++++++++++++
>>>>  2 files changed, 49 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/memory_mapping.c b/memory_mapping.c
>>>> index 718f271..b92e2f6 100644
>>>> --- a/memory_mapping.c
>>>> +++ b/memory_mapping.c
>>>> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>>>>      list->last_mapping = NULL;
>>>>      QTAILQ_INIT(&list->head);
>>>>  }
>>>> +
>>>> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
>>>> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
>>>> +{
>>>> +    CPUArchState *env;
>>>> +    RAMBlock *block;
>>>> +    ram_addr_t offset, length;
>>>> +    int ret;
>>>> +    bool paging_mode;
>>>> +
>>>> +    paging_mode = cpu_paging_enabled(first_cpu);
>>>> +    if (paging_mode) {
>>>
>>> On SMP with (n)-CPUs, we can do this check at most (n)-times.
>>>
>>> On Linux, user-mode tasks have differnet page tables. If refering to
>>> one page table, we can get one user-mode task memory only. Considering
>>> as much memory as possible, it's best to reference all CPUs with
>>> paging enabled and walk all the page tables.
>>>
>>> A problem is that linear addresses for user-mode tasks can inherently
>>> conflicts. Different user-mode tasks can have the same linear
>>> address. So, tools need to distinguish each PT_LOAD entry based on a
>>> pair of linear address and physical address, not linear address
>>> only. I don't know whether gdb does this.
>>
>> gdb only can process kernel space. Jan's gdb-python script may can process
>> user-mode tasks, but we should get user-mode task's register from the kernel
>> or note, and convest virtual address/linear address to physicall address.
>>
> 
> After I send this, I came up with the problem of page tabel coherency:
> some page table has not updated yet so we see older ones. So if we use

The page table is older? Do you mean the newest page table is in the TLB and
has not been flushed to memory?

> all the page tables referenced by all CPUs, we face inconsistency of
> some of the page tables. Essentially, we cannot avoid the issue that
> we see the page table older than the actual even if we use only one
> page table, but if restricting the use of page table to just one, we
> can at least avoid the inconsistency of multiple page tables. In other
> words, we can do paging processing normally though the table might be
> old.
> 
> So, I think
> - using page tables for all the CPUs at the same time is problematic.
> - using only one page table of the exsiting CPUs is still safe.
> 
> How about the code like this?
> 
>   cpu = find_cpu_paging_enabled(env);

If paging is enabled on two or more CPUs, which CPU should be chosen?
We cannot say that one is better than another.

>   if (cpu) {
>      /* paging processing based the page table of the found cpu */
>   }
> 
> Note that I of course consider these on the assumption that there's no
> data corruption on the guest.

I know. If the data is corrupted, we should trust the page table.

> 
>>>
>>>> +        for (env = first_cpu; env != NULL; env = env->next_cpu) {
>>>> +            ret = cpu_get_memory_mapping(list, env);
>>>> +            if (ret < 0) {
>>>> +                return -1;
>>>> +            }
>>>> +        }
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * If the guest doesn't use paging, the virtual address is equal to physical
>>>> +     * address.
>>>> +     */
>>>
>>> IIRC, ACPI sleep state goes in real-mode. There might be another that
>>> can go in real-mode. If execution enters this path in such situation,
>>> linear addresses are meaningless. But this is really rare case.
>>
>> I donot meet such case, and I donot know what should I do in this patch now.
>> So I donot change it now.
>>
> 
> Yes, we cannot see this because we are outside the guest kernel as
> long as the guest tells us that. But writing memo about this anyware
> would be necessary for the one case that paging doesn't work well.

OK.

Thanks
Wen Congyang


Hatayama, Daisuke March 27, 2012, 1:01 a.m. UTC | #5

From: Wen Congyang <wency@cn.fujitsu.com>
Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
Date: Mon, 26 Mar 2012 10:44:40 +0800

> At 03/26/2012 10:31 AM, HATAYAMA Daisuke Wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>> Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
>> Date: Mon, 26 Mar 2012 09:10:52 +0800
>> 
>>> At 03/23/2012 08:02 PM, HATAYAMA Daisuke Wrote:
>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>> Subject: [PATCH 05/11 v10] Add API to get memory mapping
>>>> Date: Tue, 20 Mar 2012 11:51:18 +0800
>>>>
>>>>> Add API to get all virtual address and physical address mapping.
>>>>> If the guest doesn't use paging, the virtual address is equal to the phyical
>>>>> address. The virtual address and physical address mapping is for gdb's user, and
>>>>> it does not include the memory that is not referenced by the page table. So if
>>>>> you want to use crash to anaylze the vmcore, please do not specify -p option.
>>>>> the reason why the -p option is not default explicitly: guest machine in a
>>>>> catastrophic state can have corrupted memory, which we cannot trust.
>>>>>
>>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>>> ---
>>>>>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>>>>>  memory_mapping.h |   15 +++++++++++++++
>>>>>  2 files changed, 49 insertions(+), 0 deletions(-)
>>>>>
>>>>> diff --git a/memory_mapping.c b/memory_mapping.c
>>>>> index 718f271..b92e2f6 100644
>>>>> --- a/memory_mapping.c
>>>>> +++ b/memory_mapping.c
>>>>> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>>>>>      list->last_mapping = NULL;
>>>>>      QTAILQ_INIT(&list->head);
>>>>>  }
>>>>> +
>>>>> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
>>>>> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
>>>>> +{
>>>>> +    CPUArchState *env;
>>>>> +    RAMBlock *block;
>>>>> +    ram_addr_t offset, length;
>>>>> +    int ret;
>>>>> +    bool paging_mode;
>>>>> +
>>>>> +    paging_mode = cpu_paging_enabled(first_cpu);
>>>>> +    if (paging_mode) {
>>>>
>>>> On SMP with (n)-CPUs, we can do this check at most (n)-times.
>>>>
>>>> On Linux, user-mode tasks have differnet page tables. If refering to
>>>> one page table, we can get one user-mode task memory only. Considering
>>>> as much memory as possible, it's best to reference all CPUs with
>>>> paging enabled and walk all the page tables.
>>>>
>>>> A problem is that linear addresses for user-mode tasks can inherently
>>>> conflicts. Different user-mode tasks can have the same linear
>>>> address. So, tools need to distinguish each PT_LOAD entry based on a
>>>> pair of linear address and physical address, not linear address
>>>> only. I don't know whether gdb does this.
>>>
>>> gdb only can process kernel space. Jan's gdb-python script may can process
>>> user-mode tasks, but we should get user-mode task's register from the kernel
>>> or note, and convest virtual address/linear address to physicall address.
>>>
>> 
>> After I send this, I came up with the problem of page tabel coherency:
>> some page table has not updated yet so we see older ones. So if we use
> 
> Tha page table is older? Do you mean the newest page table is in TLB and
> is not flushed to memory?
> 

I mean vmalloc() for the most part. (To be honest, I don't know of other
possibilities now.) In the stable state of the kernel, page tables are
allocated when user processes are created (around dup_mm()?, IIRC), and the
part for kernel space is copied from init_mm.pgd. They are then kept
coherent with init_mm.pgd at runtime by being updated when a page fault
happens. By "old" I meant a page table that has not been updated yet. For
this reason, paging can lead to different results on different CPUs.
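
The effect described above can be shown with a toy model. This is not Linux kernel code: ToyPgd and the three helpers are hypothetical, and the model only captures the sequence from the message, namely copy at process creation, update of the reference table alone, and per-task synchronization on a fault.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_KERNEL_SLOTS 4  /* kernel-space top-level entries in this toy */

typedef struct {
    uint64_t entry[TOY_KERNEL_SLOTS];  /* 0 means "not present" */
} ToyPgd;

/* Process creation: the kernel part is copied from the reference table. */
static void toy_pgd_create(ToyPgd *task, const ToyPgd *init)
{
    memcpy(task->entry, init->entry, sizeof(task->entry));
}

/* A vmalloc-style update touches only the reference table. */
static void toy_map_in_init(ToyPgd *init, int slot, uint64_t value)
{
    assert(slot >= 0 && slot < TOY_KERNEL_SLOTS);
    init->entry[slot] = value;
}

/*
 * Fault on a kernel address: copy the missing entry from the reference
 * table. Returns 1 if an entry was synced, 0 if there was nothing to do.
 */
static int toy_fault_sync(ToyPgd *task, const ToyPgd *init, int slot)
{
    assert(slot >= 0 && slot < TOY_KERNEL_SLOTS);
    if (task->entry[slot] == 0 && init->entry[slot] != 0) {
        task->entry[slot] = init->entry[slot];
        return 1;
    }
    return 0;
}
```

After toy_map_in_init(), a task that has already faulted on the new range sees the entry while a task that has not still sees 0. If those two tasks are current on two different CPUs, walking both page tables gives different answers for the same kernel address.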

>> all the page tables referenced by all CPUs, we face inconsistency of
>> some of the page tables. Essentially, we cannot avoid the issue that
>> we see the page table older than the actual even if we use only one
>> page table, but if restricting the use of page table to just one, we
>> can at least avoid the inconsistency of multiple page tables. In other
>> words, we can do paging processing normally though the table might be
>> old.
>> 
>> So, I think
>> - using page tables for all the CPUs at the same time is problematic.
>> - using only one page table of the exsiting CPUs is still safe.
>> 
>> How about the code like this?
>> 
>>   cpu = find_cpu_paging_enabled(env);
> 
> If there are more than two cpu's paging is enabled, which cpu should be choosed?
> We cannot say which one is better than another one.
> 

I think so too. But the current code looks at only one CPU. Looking at all
CPUs in order increases the chance of being able to do paging, which must be
better if users want paging.

Thanks.
HATAYAMA, Daisuke

Wen Congyang March 27, 2012, 1:25 a.m. UTC | #6

At 03/27/2012 09:01 AM, HATAYAMA Daisuke Wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
> Date: Mon, 26 Mar 2012 10:44:40 +0800
> 
>> At 03/26/2012 10:31 AM, HATAYAMA Daisuke Wrote:
>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>> Subject: Re: [PATCH 05/11 v10] Add API to get memory mapping
>>> Date: Mon, 26 Mar 2012 09:10:52 +0800
>>>
>>>> At 03/23/2012 08:02 PM, HATAYAMA Daisuke Wrote:
>>>>> From: Wen Congyang <wency@cn.fujitsu.com>
>>>>> Subject: [PATCH 05/11 v10] Add API to get memory mapping
>>>>> Date: Tue, 20 Mar 2012 11:51:18 +0800
>>>>>
>>>>>> Add API to get all virtual address and physical address mapping.
>>>>>> If the guest doesn't use paging, the virtual address is equal to the phyical
>>>>>> address. The virtual address and physical address mapping is for gdb's user, and
>>>>>> it does not include the memory that is not referenced by the page table. So if
>>>>>> you want to use crash to anaylze the vmcore, please do not specify -p option.
>>>>>> the reason why the -p option is not default explicitly: guest machine in a
>>>>>> catastrophic state can have corrupted memory, which we cannot trust.
>>>>>>
>>>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>>>> ---
>>>>>>  memory_mapping.c |   34 ++++++++++++++++++++++++++++++++++
>>>>>>  memory_mapping.h |   15 +++++++++++++++
>>>>>>  2 files changed, 49 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/memory_mapping.c b/memory_mapping.c
>>>>>> index 718f271..b92e2f6 100644
>>>>>> --- a/memory_mapping.c
>>>>>> +++ b/memory_mapping.c
>>>>>> @@ -164,3 +164,37 @@ void memory_mapping_list_init(MemoryMappingList *list)
>>>>>>      list->last_mapping = NULL;
>>>>>>      QTAILQ_INIT(&list->head);
>>>>>>  }
>>>>>> +
>>>>>> +#if defined(CONFIG_HAVE_GET_MEMORY_MAPPING)
>>>>>> +int qemu_get_guest_memory_mapping(MemoryMappingList *list)
>>>>>> +{
>>>>>> +    CPUArchState *env;
>>>>>> +    RAMBlock *block;
>>>>>> +    ram_addr_t offset, length;
>>>>>> +    int ret;
>>>>>> +    bool paging_mode;
>>>>>> +
>>>>>> +    paging_mode = cpu_paging_enabled(first_cpu);
>>>>>> +    if (paging_mode) {
>>>>>
>>>>> On SMP with (n)-CPUs, we can do this check at most (n)-times.
>>>>>
>>>>> On Linux, user-mode tasks have differnet page tables. If refering to
>>>>> one page table, we can get one user-mode task memory only. Considering
>>>>> as much memory as possible, it's best to reference all CPUs with
>>>>> paging enabled and walk all the page tables.
>>>>>
>>>>> A problem is that linear addresses for user-mode tasks can inherently
>>>>> conflicts. Different user-mode tasks can have the same linear
>>>>> address. So, tools need to distinguish each PT_LOAD entry based on a
>>>>> pair of linear address and physical address, not linear address
>>>>> only. I don't know whether gdb does this.
>>>>
>>>> gdb only can process kernel space. Jan's gdb-python script may can process
>>>> user-mode tasks, but we should get user-mode task's register from the kernel
>>>> or note, and convest virtual address/linear address to physicall address.
>>>>
>>>
>>> After I send this, I came up with the problem of page tabel coherency:
>>> some page table has not updated yet so we see older ones. So if we use
>>
>> Tha page table is older? Do you mean the newest page table is in TLB and
>> is not flushed to memory?
>>
> 
> I say vmalloc() in most part. (to be honest I don't know other
> possibility now) In stable state of kernel, page tables are allocated
> when user processes are created (around dup_mm()?, IIRC), where part
> for kernel space is copied from init_mm.pgd. They are updated at
> runtime coherently from init_mm.pgd when page fault happens. I
> expressed the page table that has not updated yet as old. For this
> reason, paging can lead to different result for differnet CPU.

Yes, paging can lead to different results on different CPUs.
But the purpose of paging here is to allow the user to debug the kernel by
using gdb on the core file. If the user wants to debug a user process in
the core file, he can extract the user process's core from the vmcore by
using crash's gcore extension module.

> 
>>> all the page tables referenced by all CPUs, we face inconsistency of
>>> some of the page tables. Essentially, we cannot avoid the issue that
>>> we see the page table older than the actual even if we use only one
>>> page table, but if restricting the use of page table to just one, we
>>> can at least avoid the inconsistency of multiple page tables. In other
>>> words, we can do paging processing normally though the table might be
>>> old.
>>>
>>> So, I think
>>> - using page tables for all the CPUs at the same time is problematic.
>>> - using only one page table of the exsiting CPUs is still safe.
>>>
>>> How about the code like this?
>>>
>>>   cpu = find_cpu_paging_enabled(env);
>>
>> If there are more than two cpu's paging is enabled, which cpu should be choosed?
>> We cannot say which one is better than another one.
>>
> 
> I think so too. But now it sees only one CPU. Seeing all CPUs in order
> can increase possibility to do paging, which must be better if users
> want to do paging.

OK. I understand what you mean now. I will update patch 05/12.

Thanks
Wen Congyang

