Patchwork [v3,06/12] memory-hotplug: unregister memory section on SPARSEMEM_VMEMMAP

Submitter Wen Congyang
Date Nov. 1, 2012, 9:44 a.m.
Message ID <1351763083-7905-7-git-send-email-wency@cn.fujitsu.com>
Permalink /patch/196120/
State Not Applicable
Delegated to: David Miller

Comments

Wen Congyang - Nov. 1, 2012, 9:44 a.m.
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Currently __remove_section() for SPARSEMEM_VMEMMAP does nothing except
return -EBUSY. But even if we use SPARSEMEM_VMEMMAP, we can unregister the
memory_section.

So this patch adds a call to unregister_memory_section() in
__remove_section().

CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
 mm/memory_hotplug.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
Jaegeuk Hanse - Nov. 20, 2012, 6:22 a.m.
On 11/01/2012 05:44 PM, Wen Congyang wrote:
> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>
> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But even if
> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>
> So the patch add unregister_memory_section() into __remove_section().

Hi Yasuaki,

To review this patch I first need to dig into the sparse memory code, and
part of it confuses me. Why do we need to encode/decode the mem_map
instead of storing mem_map in ms->section_mem_map directly?

Regards,
Jaegeuk

>
> CC: David Rientjes <rientjes@google.com>
> CC: Jiang Liu <liuj97@gmail.com>
> CC: Len Brown <len.brown@intel.com>
> CC: Christoph Lameter <cl@linux.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
> ---
>   mm/memory_hotplug.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index ca07433..66a79a7 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -286,11 +286,14 @@ static int __meminit __add_section(int nid, struct zone *zone,
>   #ifdef CONFIG_SPARSEMEM_VMEMMAP
>   static int __remove_section(struct zone *zone, struct mem_section *ms)
>   {
> -	/*
> -	 * XXX: Freeing memmap with vmemmap is not implement yet.
> -	 *      This should be removed later.
> -	 */
> -	return -EBUSY;
> +	int ret = -EINVAL;
> +
> +	if (!valid_section(ms))
> +		return ret;
> +
> +	ret = unregister_memory_section(ms);
> +
> +	return ret;
>   }
>   #else
>   static int __remove_section(struct zone *zone, struct mem_section *ms)

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wen Congyang - Nov. 20, 2012, 6:55 a.m.
At 11/20/2012 02:22 PM, Jaegeuk Hanse Wrote:
> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>
>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>> even if
>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>
>> So the patch add unregister_memory_section() into __remove_section().
> 
> Hi Yasuaki,
> 
> In order to review this patch, I should dig sparse memory codes in
> advance. But I have some confuse of codes. Why need encode/decode mem
> map instead of set mem_map to ms->section_mem_map directly?

The memmap is aligned, so its low bits are zero. We store some information
in these bits, which is why the memmap has to be encoded and decoded here.
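
As an illustration of the point above, here is a minimal userspace sketch of
keeping flags in the low bits of an aligned value. The DEMO_* names and the
demo_* helpers are hypothetical and only loosely modelled on the section
flag bits; this is not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits occupying the two lowest bits of the stored word. */
#define DEMO_FLAG_PRESENT ((uintptr_t)1 << 0)
#define DEMO_FLAG_HAS_MAP ((uintptr_t)1 << 1)
#define DEMO_FLAG_MASK    (DEMO_FLAG_PRESENT | DEMO_FLAG_HAS_MAP)

/* Combine an aligned value with flags; the value's low bits must be zero. */
static uintptr_t demo_pack(uintptr_t aligned_value, uintptr_t flags)
{
	assert((aligned_value & DEMO_FLAG_MASK) == 0);
	assert((flags & ~DEMO_FLAG_MASK) == 0);
	return aligned_value | flags;
}

/* Recover the aligned value by masking the flag bits off again. */
static uintptr_t demo_value(uintptr_t packed)
{
	return packed & ~DEMO_FLAG_MASK;
}

/* Recover only the flag bits. */
static uintptr_t demo_flags(uintptr_t packed)
{
	return packed & DEMO_FLAG_MASK;
}
```

Because the stored word is no longer a plain pointer, every reader has to
mask the flags off first, and that masking is what forces a decode step.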

Thanks
Wen Congyang


Jaegeuk Hanse - Nov. 20, 2012, 6:58 a.m.
On 11/20/2012 02:55 PM, Wen Congyang wrote:
> At 11/20/2012 02:22 PM, Jaegeuk Hanse Wrote:
>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>
>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>> even if
>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>
>>> So the patch add unregister_memory_section() into __remove_section().
>> Hi Yasuaki,
>>
>> In order to review this patch, I should dig sparse memory codes in
>> advance. But I have some confuse of codes. Why need encode/decode mem
>> map instead of set mem_map to ms->section_mem_map directly?
> The memmap is aligned, and the low bits are zero. We store some information
> in these bits. So we need to encode/decode memmap here.

Hi Congyang,

Thanks for your response. What I mean is: why does sparse_encode_mem_map()
return (unsigned long)(mem_map - (section_nr_to_pfn(pnum))); and
sparse_decode_mem_map() then
return ((struct page *)coded_mem_map) + section_nr_to_pfn(pnum);
instead of just storing mem_map in ms->section_mem_map directly?

Regards,
Jaegeuk


Wen Congyang - Nov. 20, 2012, 9:37 a.m.
At 11/20/2012 02:58 PM, Jaegeuk Hanse Wrote:
> On 11/20/2012 02:55 PM, Wen Congyang wrote:
>> At 11/20/2012 02:22 PM, Jaegeuk Hanse Wrote:
>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>
>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>> even if
>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>
>>>> So the patch add unregister_memory_section() into __remove_section().
>>> Hi Yasuaki,
>>>
>>> In order to review this patch, I should dig sparse memory codes in
>>> advance. But I have some confuse of codes. Why need encode/decode mem
>>> map instead of set mem_map to ms->section_mem_map directly?
>> The memmap is aligned, and the low bits are zero. We store some
>> information
>> in these bits. So we need to encode/decode memmap here.
> 
> Hi Congyang,
> 
> Thanks for you reponse. But I mean why return (unsigned long)(mem_map -
> (section_nr_to_pfn(pnum))); in function sparse_encode_mem_map, and then
> return ((struct page *)coded_mem_map) + section_nr_to_pfn(pnum); in
> funtion sparse_decode_mem_map instead of just store mem_map in
> ms->section_mep_map directly.

I don't know why. I tried to find the reason, but I could not find any
place that uses the pfn stored in the mem_map except the decode
function. Maybe the designer doesn't want us to access the mem_map
directly.

Thanks
Wen Congyang


Jaegeuk Hanse - Nov. 20, 2012, 11:03 a.m.
On 11/20/2012 05:37 PM, Wen Congyang wrote:
> At 11/20/2012 02:58 PM, Jaegeuk Hanse Wrote:
>> On 11/20/2012 02:55 PM, Wen Congyang wrote:
>>> At 11/20/2012 02:22 PM, Jaegeuk Hanse Wrote:
>>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>
>>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>>> even if
>>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>>
>>>>> So the patch add unregister_memory_section() into __remove_section().
>>>> Hi Yasuaki,
>>>>
>>>> In order to review this patch, I should dig sparse memory codes in
>>>> advance. But I have some confuse of codes. Why need encode/decode mem
>>>> map instead of set mem_map to ms->section_mem_map directly?
>>> The memmap is aligned, and the low bits are zero. We store some
>>> information
>>> in these bits. So we need to encode/decode memmap here.
>> Hi Congyang,
>>
>> Thanks for you reponse. But I mean why return (unsigned long)(mem_map -
>> (section_nr_to_pfn(pnum))); in function sparse_encode_mem_map, and then
>> return ((struct page *)coded_mem_map) + section_nr_to_pfn(pnum); in
>> funtion sparse_decode_mem_map instead of just store mem_map in
>> ms->section_mep_map directly.
> I don't know why. I try to find the reason, but I don't find any
> place to use the pfn stored in the mem_map except in the decode
> function. Maybe the designer doesn't want us to access the mem_map
> directly.

It seems that mem_map is per node, but the pfn is the real (global) pfn.
You can check __page_to_pfn.
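
To make the arithmetic concrete, here is a small userspace sketch of the
biased encoding. The demo_* names and the section size of 8 pages are
hypothetical. My understanding, which should be checked against the
comments in mm/sparse.c, is that storing mem_map minus the section's first
pfn lets a lookup index the stored base with the global pfn directly.

```c
#include <assert.h>
#include <stdint.h>

struct demo_page { unsigned long flags; };

#define DEMO_PAGES_PER_SECTION 8UL /* hypothetical, tiny for the demo */

static unsigned long demo_section_nr_to_pfn(unsigned long nr)
{
	return nr * DEMO_PAGES_PER_SECTION;
}

/*
 * Encode: bias the section's mem_map by the section's first pfn.
 * Done on uintptr_t so the out-of-range intermediate value is plain
 * unsigned arithmetic rather than out-of-bounds pointer arithmetic.
 */
static uintptr_t demo_encode(struct demo_page *mem_map, unsigned long nr)
{
	return (uintptr_t)mem_map -
	       demo_section_nr_to_pfn(nr) * sizeof(struct demo_page);
}

/* Decode: undo the bias to get the section's own mem_map back. */
static struct demo_page *demo_decode(uintptr_t coded, unsigned long nr)
{
	return (struct demo_page *)(coded +
	       demo_section_nr_to_pfn(nr) * sizeof(struct demo_page));
}

/* Lookup: index the biased base with the global pfn, no subtraction. */
static struct demo_page *demo_pfn_to_page(uintptr_t coded, unsigned long pfn)
{
	return (struct demo_page *)(coded + pfn * sizeof(struct demo_page));
}
```

With section 3 and a global pfn of 3 * 8 + 5, demo_pfn_to_page() lands on
entry 5 of that section's array without first subtracting the section's
start pfn.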


Jaegeuk Hanse - Nov. 20, 2012, 11:16 a.m.
On 11/01/2012 05:44 PM, Wen Congyang wrote:
> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>
> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But even if
> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>
> So the patch add unregister_memory_section() into __remove_section().

Hi Yasuaki,

I have a question about these sparse vmemmap related patches. Hot-adding
memory needs vmemmap pages to be allocated, but at that point they are
allocated by the buddy system. How can we guarantee that their virtual
address is contiguous with the addresses allocated before? If it is not
contiguous, page_to_pfn and pfn_to_page can't work correctly.

Regards,
Jaegeuk


Wen Congyang - Nov. 21, 2012, 3:05 a.m.
At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>
>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>> even if
>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>
>> So the patch add unregister_memory_section() into __remove_section().
> 
> Hi Yasuaki,
> 
> I have a question about these sparse vmemmap memory related patches. Hot
> add memory need allocated vmemmap pages, but this time is allocated by
> buddy system. How can gurantee virtual address is continuous to the
> address allocated before? If not continuous, page_to_pfn and pfn_to_page
> can't work correctly.

vmemmap has its virtual address range:
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)

We allocate memory from the buddy system to store the struct pages, and its
virtual address isn't in this range. So we have to update the page table:

kmalloc_section_memmap()
    sparse_mem_map_populate()
        pfn_to_page() // get the virtual address in the vmemmap range
        vmemmap_populate() // we update page table here

When we use vmemmap, pfn_to_page() always returns an address in the vmemmap
range, not the address that the allocation returned. So the virtual
address is contiguous.
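
The conversion described above can be sketched in userspace as follows.
The base address is the start of the x86_64 range quoted earlier in this
mail; the demo_* names and the 64-byte struct size are assumptions for
illustration only, and addresses are kept as plain 64-bit integers because
nothing is mapped there in a userspace process.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; 64 bytes is an assumption. */
struct demo_page { unsigned char pad[64]; };

/* Start of the vmemmap virtual range quoted in this thread (x86_64). */
#define DEMO_VMEMMAP_START UINT64_C(0xffffea0000000000)

/* pfn -> virtual address of its struct page: base plus a fixed stride. */
static uint64_t demo_pfn_to_page(uint64_t pfn)
{
	return DEMO_VMEMMAP_START + pfn * sizeof(struct demo_page);
}

/* Virtual address of a struct page -> pfn: the exact inverse. */
static uint64_t demo_page_to_pfn(uint64_t page_addr)
{
	assert(page_addr >= DEMO_VMEMMAP_START);
	return (page_addr - DEMO_VMEMMAP_START) / sizeof(struct demo_page);
}
```

Neither direction depends on where the backing memory came from. The
address is fixed by the pfn alone, and the page table update is what makes
that address point at the pages the buddy system handed out.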

Thanks
Wen Congyang

Jaegeuk Hanse - Nov. 21, 2012, 4:22 a.m.
On 11/21/2012 11:05 AM, Wen Congyang wrote:
> At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>
>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>> even if
>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>
>>> So the patch add unregister_memory_section() into __remove_section().
>> Hi Yasuaki,
>>
>> I have a question about these sparse vmemmap memory related patches. Hot
>> add memory need allocated vmemmap pages, but this time is allocated by
>> buddy system. How can gurantee virtual address is continuous to the
>> address allocated before? If not continuous, page_to_pfn and pfn_to_page
>> can't work correctly.
> vmemmap has its virtual address range:
> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
>
> We allocate memory from buddy system to store struct page, and its virtual
> address isn't in this range. So we should update the page table:
>
> kmalloc_section_memmap()
>      sparse_mem_map_populate()
>          pfn_to_page() // get the virtual address in the vmemmap range
>          vmemmap_populate() // we update page table here
>
> When we use vmemmap, page_to_pfn() always returns address in the vmemmap
> range, not the address that kmalloc() returns. So the virtual address
> is continuous.

Hi Congyang,

Another question about memory hotplug. Hot-removing memory will also call
memblock_remove() to remove the related memblock:
memblock_remove()
            __memblock_remove()
                    memblock_isolate_range()
                    memblock_remove_region()

But memblock_isolate_range() only records fully contained regions.
Regions that partially overlap are just split instead of being recorded,
so these partially overlapping regions can't be removed. What am I missing?

Regards,
Jaegeuk


Wen Congyang - Nov. 21, 2012, 4:42 a.m.
At 11/21/2012 12:22 PM, Jaegeuk Hanse Wrote:
> On 11/21/2012 11:05 AM, Wen Congyang wrote:
>> At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>
>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>> even if
>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>
>>>> So the patch add unregister_memory_section() into __remove_section().
>>> Hi Yasuaki,
>>>
>>> I have a question about these sparse vmemmap memory related patches. Hot
>>> add memory need allocated vmemmap pages, but this time is allocated by
>>> buddy system. How can gurantee virtual address is continuous to the
>>> address allocated before? If not continuous, page_to_pfn and pfn_to_page
>>> can't work correctly.
>> vmemmap has its virtual address range:
>> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
>>
>> We allocate memory from buddy system to store struct page, and its
>> virtual
>> address isn't in this range. So we should update the page table:
>>
>> kmalloc_section_memmap()
>>      sparse_mem_map_populate()
>>          pfn_to_page() // get the virtual address in the vmemmap range
>>          vmemmap_populate() // we update page table here
>>
>> When we use vmemmap, page_to_pfn() always returns address in the vmemmap
>> range, not the address that kmalloc() returns. So the virtual address
>> is continuous.
> 
> Hi Congyang,
> 
> Another question about memory hotplug. During hot remove memory, it will
> also call memblock_remove to remove related memblock.

IIRC, we don't touch memblock when hot-adding or hot-removing memory.
memblock is only used by the bootmem allocator, and I think it isn't used
after booting.

> memblock_remove()
>            __memblock_remove()
>                    memblock_isolate_range()
>                    memblock_remove_region()
> 
> But memblock_isolate_range() only record fully contained regions,
> regions which are partial overlapped just be splitted instead of record.
> So these partial overlapped regions can't be removed. Where I miss?

No, memblock_isolate_range() can deal with partially overlapping regions:
=====================
		if (rbase < base) {
			/*
			 * @rgn intersects from below.  Split and continue
			 * to process the next region - the new top half.
			 */
			rgn->base = base;
			rgn->size -= base - rbase;
			type->total_size -= base - rbase;
			memblock_insert_region(type, i, rbase, base - rbase,
					       memblock_get_region_node(rgn));
		} else if (rend > end) {
			/*
			 * @rgn intersects from above.  Split and redo the
			 * current region - the new bottom half.
			 */
			rgn->base = end;
			rgn->size -= end - rbase;
			type->total_size -= end - rbase;
			memblock_insert_region(type, i--, rbase, end - rbase,
					       memblock_get_region_node(rgn));
=====================

If a region only partially overlaps the range, we split the old region
into two regions. After that, the part inside the range is a fully
contained region and is recorded like any other.
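
The split-then-record behaviour can be tried out with a small userspace
model. This is only a sketch that mirrors the structure of the excerpt
above; the demo_* types and helpers are hypothetical and leave out nodes,
total_size and array resizing.

```c
#include <assert.h>

#define DEMO_MAX_REGIONS 16

struct demo_region { unsigned long base, size; };
struct demo_type { int cnt; struct demo_region regions[DEMO_MAX_REGIONS]; };

/* Insert a region at index idx, shifting later entries up by one. */
static void demo_insert(struct demo_type *t, int idx,
			unsigned long base, unsigned long size)
{
	assert(t->cnt < DEMO_MAX_REGIONS && idx >= 0 && idx <= t->cnt);
	for (int i = t->cnt; i > idx; i--)
		t->regions[i] = t->regions[i - 1];
	t->regions[idx].base = base;
	t->regions[idx].size = size;
	t->cnt++;
}

/* Split at base and end; report fully contained regions as [*start, *stop). */
static void demo_isolate(struct demo_type *t, unsigned long base,
			 unsigned long end, int *start, int *stop)
{
	*start = *stop = 0;
	for (int i = 0; i < t->cnt; i++) {
		struct demo_region *rgn = &t->regions[i];
		unsigned long rbase = rgn->base, rend = rgn->base + rgn->size;

		if (rbase >= end)
			break;
		if (rend <= base)
			continue;
		if (rbase < base) {
			/* Intersects from below: split, then visit the top half next. */
			rgn->base = base;
			rgn->size -= base - rbase;
			demo_insert(t, i, rbase, base - rbase);
		} else if (rend > end) {
			/* Intersects from above: split, then redo the bottom half. */
			rgn->base = end;
			rgn->size -= end - rbase;
			demo_insert(t, i--, rbase, end - rbase);
		} else {
			/* Fully contained now: record it. */
			if (!*stop)
				*start = i;
			*stop = i + 1;
		}
	}
}

/* Remove [base, base + size) by isolating it and dropping those regions. */
static void demo_remove(struct demo_type *t, unsigned long base,
			unsigned long size)
{
	int start, stop;

	demo_isolate(t, base, base + size, &start, &stop);
	for (int i = stop - 1; i >= start; i--) {
		for (int j = i; j < t->cnt - 1; j++)
			t->regions[j] = t->regions[j + 1];
		t->cnt--;
	}
}
```

Removing [20, 60) from a single region [0, 100) first splits it into three
pieces and then drops only the middle one, so a partial overlap is removed
correctly.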

Thanks
Wen Congyang


Jaegeuk Hanse - Nov. 21, 2012, 5:03 a.m.
On 11/21/2012 12:42 PM, Wen Congyang wrote:
> At 11/21/2012 12:22 PM, Jaegeuk Hanse Wrote:
>> On 11/21/2012 11:05 AM, Wen Congyang wrote:
>>> At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
>>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>
>>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>>> even if
>>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>>
>>>>> So the patch add unregister_memory_section() into __remove_section().
>>>> Hi Yasuaki,
>>>>
>>>> I have a question about these sparse vmemmap memory related patches. Hot
>>>> add memory need allocated vmemmap pages, but this time is allocated by
>>>> buddy system. How can gurantee virtual address is continuous to the
>>>> address allocated before? If not continuous, page_to_pfn and pfn_to_page
>>>> can't work correctly.
>>> vmemmap has its virtual address range:
>>> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
>>>
>>> We allocate memory from buddy system to store struct page, and its
>>> virtual
>>> address isn't in this range. So we should update the page table:
>>>
>>> kmalloc_section_memmap()
>>>       sparse_mem_map_populate()
>>>           pfn_to_page() // get the virtual address in the vmemmap range
>>>           vmemmap_populate() // we update page table here
>>>
>>> When we use vmemmap, page_to_pfn() always returns address in the vmemmap
>>> range, not the address that kmalloc() returns. So the virtual address
>>> is continuous.
>> Hi Congyang,
>>
>> Another question about memory hotplug. During hot remove memory, it will
>> also call memblock_remove to remove related memblock.
> IIRC, we don't touch memblock when hot-add/hot-remove memory. memblock is
> only used for bootmem allocator. I think it isn't used after booting.

It is called on IBM pSeries servers:

pseries_remove_memory()
     pseries_remove_memblock()
         memblock_remove()

Furthermore, on x86 memblock records the available memory ranges taken
from the e820 map (you can check this in memblock_x86_fill()). After
hot-removing memory, that range is no longer available, so why not remove
it from memblock as the pSeries code does?

>> memblock_remove()
>>             __memblock_remove()
>>                     memblock_isolate_range()
>>                     memblock_remove_region()
>>
>> But memblock_isolate_range() only record fully contained regions,
>> regions which are partial overlapped just be splitted instead of record.
>> So these partial overlapped regions can't be removed. Where I miss?
> No, memblock_isolate_range() can deal with partial overlapped region.
> =====================
> 		if (rbase < base) {
> 			/*
> 			 * @rgn intersects from below.  Split and continue
> 			 * to process the next region - the new top half.
> 			 */
> 			rgn->base = base;
> 			rgn->size -= base - rbase;
> 			type->total_size -= base - rbase;
> 			memblock_insert_region(type, i, rbase, base - rbase,
> 					       memblock_get_region_node(rgn));
> 		} else if (rend > end) {
> 			/*
> 			 * @rgn intersects from above.  Split and redo the
> 			 * current region - the new bottom half.
> 			 */
> 			rgn->base = end;
> 			rgn->size -= end - rbase;
> 			type->total_size -= end - rbase;
> 			memblock_insert_region(type, i--, rbase, end - rbase,
> 					       memblock_get_region_node(rgn));
> =====================
>
> If the region is partial overlapped region, we will split the old region into
> two regions. After doing this, it is full contained region now.

You are right, I misunderstood the code.


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wen Congyang - Nov. 21, 2012, 5:12 a.m.
At 11/21/2012 01:03 PM, Jaegeuk Hanse Wrote:
> On 11/21/2012 12:42 PM, Wen Congyang wrote:
>> At 11/21/2012 12:22 PM, Jaegeuk Hanse Wrote:
>>> On 11/21/2012 11:05 AM, Wen Congyang wrote:
>>>> At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
>>>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>>
>>>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>>>> even if
>>>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>>>
>>>>>> So the patch add unregister_memory_section() into __remove_section().
>>>>> Hi Yasuaki,
>>>>>
>>>>> I have a question about these sparse vmemmap memory related
>>>>> patches. Hot-added memory needs vmemmap pages allocated, but this time
>>>>> they are allocated by the buddy system. How can we guarantee that their
>>>>> virtual address is contiguous with the address allocated before? If it
>>>>> is not contiguous, page_to_pfn and pfn_to_page can't work correctly.
>>>> vmemmap has its virtual address range:
>>>> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
>>>>
>>>> We allocate memory from the buddy system to store struct page, and its
>>>> virtual address isn't in this range. So we should update the page table:
>>>>
>>>> kmalloc_section_memmap()
>>>>       sparse_mem_map_populate()
>>>>           pfn_to_page() // get the virtual address in the vmemmap range
>>>>           vmemmap_populate() // we update page table here
>>>>
>>>> When we use vmemmap, pfn_to_page() always returns an address in the
>>>> vmemmap range, not the address that kmalloc() returns. So the virtual
>>>> addresses are contiguous.
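
The arithmetic behind this can be sketched in user space. It is a model only:
MODEL_STRUCT_PAGE_SIZE (64 bytes) and the model_* helper names are
assumptions made for illustration, and addresses are handled as plain
integers rather than struct page pointers. VMEMMAP_START and VMEMMAP_END are
the bounds of the range quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds of the x86_64 vmemmap range quoted in this thread. */
#define VMEMMAP_START		0xffffea0000000000ULL
#define VMEMMAP_END		0xffffeaffffffffffULL

/* Assumed size of one struct page; the real size depends on the config. */
#define MODEL_STRUCT_PAGE_SIZE	64ULL

/* Model of pfn_to_page(): a fixed base plus a pfn-indexed offset. */
static uint64_t model_pfn_to_page(uint64_t pfn)
{
	uint64_t addr = VMEMMAP_START + pfn * MODEL_STRUCT_PAGE_SIZE;

	assert(addr >= VMEMMAP_START && addr <= VMEMMAP_END);
	return addr;
}

/* Model of page_to_pfn(): the inverse of the calculation above. */
static uint64_t model_page_to_pfn(uint64_t page_addr)
{
	assert(page_addr >= VMEMMAP_START && page_addr <= VMEMMAP_END);
	return (page_addr - VMEMMAP_START) / MODEL_STRUCT_PAGE_SIZE;
}
```

Because the result depends only on the pfn, the struct page entries for
neighbouring pfns are adjacent in the virtual map no matter where the backing
memory was allocated; the page table update is what makes those virtual
addresses point at it.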
>>> Hi Congyang,
>>>
>>> Another question about memory hotplug. During hot remove memory, it will
>>> also call memblock_remove to remove related memblock.
>> IIRC, we don't touch memblock when hot-add/hot-remove memory. memblock is
>> only used for bootmem allocator. I think it isn't used after booting.
> 
> memblock is touched on IBM pseries servers:
> 
> pseries_remove_memory()
>     pseries_remove_memblock()
>         memblock_remove()
> 
> Furthermore, on x86 memblock records the available memory ranges taken from
> the e820 map (see memblock_x86_fill()). After memory is hot-removed, that
> range is no longer available, so why not remove it from memblock as the
> pseries code does?

Oh, that is powerpc code, and I haven't read it. I will check it now.

Thanks for pointing it out.

Wen Congyang

> 
>>> memblock_remove()
>>>             __memblock_remove()
>>>
>>>                     memblock_isolate_range()
>>>                     memblock_remove_region()
>>>
>>> But memblock_isolate_range() only records fully contained regions;
>>> regions that partially overlap are just split instead of recorded.
>>> So these partially overlapping regions can't be removed. What am I missing?
>> No, memblock_isolate_range() can deal with partially overlapping regions.
>> =====================
>>         if (rbase < base) {
>>             /*
>>              * @rgn intersects from below.  Split and continue
>>              * to process the next region - the new top half.
>>              */
>>             rgn->base = base;
>>             rgn->size -= base - rbase;
>>             type->total_size -= base - rbase;
>>             memblock_insert_region(type, i, rbase, base - rbase,
>>                            memblock_get_region_node(rgn));
>>         } else if (rend > end) {
>>             /*
>>              * @rgn intersects from above.  Split and redo the
>>              * current region - the new bottom half.
>>              */
>>             rgn->base = end;
>>             rgn->size -= end - rbase;
>>             type->total_size -= end - rbase;
>>             memblock_insert_region(type, i--, rbase, end - rbase,
>>                            memblock_get_region_node(rgn));
>> =====================
>>
>> If a region partially overlaps the range, we split the old region into
>> two regions. After that, the part inside the range is fully contained.
> 
> You are right, I misunderstood the code.
> 
>>
>> Thanks
>> Wen Congyang
>>
>>> Regards,
>>> Jaegeuk
>>>
>>>> Thanks
>>>> Wen Congyang
>>>>> Regards,
>>>>> Jaegeuk
>>>>>
>>>>>> CC: David Rientjes <rientjes@google.com>
>>>>>> CC: Jiang Liu <liuj97@gmail.com>
>>>>>> CC: Len Brown <len.brown@intel.com>
>>>>>> CC: Christoph Lameter <cl@linux.com>
>>>>>> Cc: Minchan Kim <minchan.kim@gmail.com>
>>>>>> CC: Andrew Morton <akpm@linux-foundation.org>
>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>>> CC: Wen Congyang <wency@cn.fujitsu.com>
>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>> ---
>>>>>>     mm/memory_hotplug.c | 13 ++++++++-----
>>>>>>     1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>>>>> index ca07433..66a79a7 100644
>>>>>> --- a/mm/memory_hotplug.c
>>>>>> +++ b/mm/memory_hotplug.c
>>>>>> @@ -286,11 +286,14 @@ static int __meminit __add_section(int nid,
>>>>>> struct zone *zone,
>>>>>>     #ifdef CONFIG_SPARSEMEM_VMEMMAP
>>>>>>     static int __remove_section(struct zone *zone, struct mem_section
>>>>>> *ms)
>>>>>>     {
>>>>>> -    /*
>>>>>> -     * XXX: Freeing memmap with vmemmap is not implement yet.
>>>>>> -     *      This should be removed later.
>>>>>> -     */
>>>>>> -    return -EBUSY;
>>>>>> +    int ret = -EINVAL;
>>>>>> +
>>>>>> +    if (!valid_section(ms))
>>>>>> +        return ret;
>>>>>> +
>>>>>> +    ret = unregister_memory_section(ms);
>>>>>> +
>>>>>> +    return ret;
>>>>>>     }
>>>>>>     #else
>>>>>>     static int __remove_section(struct zone *zone, struct mem_section
>>>>>> *ms)
>>>
> 
> 

Wen Congyang - Nov. 21, 2012, 5:28 a.m.
At 11/21/2012 01:03 PM, Jaegeuk Hanse Wrote:
> On 11/21/2012 12:42 PM, Wen Congyang wrote:
>> At 11/21/2012 12:22 PM, Jaegeuk Hanse Wrote:
>>> On 11/21/2012 11:05 AM, Wen Congyang wrote:
>>>> At 11/20/2012 07:16 PM, Jaegeuk Hanse Wrote:
>>>>> On 11/01/2012 05:44 PM, Wen Congyang wrote:
>>>>>> From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>>
>>>>>> Currently __remove_section for SPARSEMEM_VMEMMAP does nothing. But
>>>>>> even if
>>>>>> we use SPARSEMEM_VMEMMAP, we can unregister the memory_section.
>>>>>>
>>>>>> So the patch add unregister_memory_section() into __remove_section().
>>>>> Hi Yasuaki,
>>>>>
>>>>> I have a question about these sparse vmemmap memory related
>>>>> patches. Hot-added memory needs vmemmap pages allocated, but this time
>>>>> they are allocated by the buddy system. How can we guarantee that their
>>>>> virtual address is contiguous with the address allocated before? If it
>>>>> is not contiguous, page_to_pfn and pfn_to_page can't work correctly.
>>>> vmemmap has its virtual address range:
>>>> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
>>>>
>>>> We allocate memory from the buddy system to store struct page, and its
>>>> virtual address isn't in this range. So we should update the page table:
>>>>
>>>> kmalloc_section_memmap()
>>>>       sparse_mem_map_populate()
>>>>           pfn_to_page() // get the virtual address in the vmemmap range
>>>>           vmemmap_populate() // we update page table here
>>>>
>>>> When we use vmemmap, pfn_to_page() always returns an address in the
>>>> vmemmap range, not the address that kmalloc() returns. So the virtual
>>>> addresses are contiguous.
>>> Hi Congyang,
>>>
>>> Another question about memory hotplug. During hot remove memory, it will
>>> also call memblock_remove to remove related memblock.
>> IIRC, we don't touch memblock when hot-add/hot-remove memory. memblock is
>> only used for bootmem allocator. I think it isn't used after booting.
> 
> memblock is touched on IBM pseries servers:
> 
> pseries_remove_memory()
>     pseries_remove_memblock()
>         memblock_remove()

It seems that pseries servers don't use ACPI (ACPI is currently supported
only on ia64 and x86; arm will be supported in the future).

I am not a ppc expert, and I don't know why memblock is touched when
hot-adding memory on ppc. But IIRC, on x86 we don't need memblock after the
machine has booted, so there is no need to touch it when hot-adding or
hot-removing memory on x86.

Thanks
Wen Congyang

> 
> Furthermore, on x86 memblock records the available memory ranges taken from
> the e820 map (see memblock_x86_fill()). After memory is hot-removed, that
> range is no longer available, so why not remove it from memblock as the
> pseries code does?
> 
>>> memblock_remove()
>>>             __memblock_remove()
>>>
>>>                     memblock_isolate_range()
>>>                     memblock_remove_region()
>>>
>>> But memblock_isolate_range() only records fully contained regions;
>>> regions that partially overlap are just split instead of recorded.
>>> So these partially overlapping regions can't be removed. What am I missing?
>> No, memblock_isolate_range() can deal with partially overlapping regions.
>> =====================
>>         if (rbase < base) {
>>             /*
>>              * @rgn intersects from below.  Split and continue
>>              * to process the next region - the new top half.
>>              */
>>             rgn->base = base;
>>             rgn->size -= base - rbase;
>>             type->total_size -= base - rbase;
>>             memblock_insert_region(type, i, rbase, base - rbase,
>>                            memblock_get_region_node(rgn));
>>         } else if (rend > end) {
>>             /*
>>              * @rgn intersects from above.  Split and redo the
>>              * current region - the new bottom half.
>>              */
>>             rgn->base = end;
>>             rgn->size -= end - rbase;
>>             type->total_size -= end - rbase;
>>             memblock_insert_region(type, i--, rbase, end - rbase,
>>                            memblock_get_region_node(rgn));
>> =====================
>>
>> If a region partially overlaps the range, we split the old region into
>> two regions. After that, the part inside the range is fully contained.
> 
> You are right, I misunderstood the code.
> 
>>
>> Thanks
>> Wen Congyang
>>
>>> Regards,
>>> Jaegeuk
>>>
>>>> Thanks
>>>> Wen Congyang
>>>>> Regards,
>>>>> Jaegeuk
>>>>>
>>>>>> CC: David Rientjes <rientjes@google.com>
>>>>>> CC: Jiang Liu <liuj97@gmail.com>
>>>>>> CC: Len Brown <len.brown@intel.com>
>>>>>> CC: Christoph Lameter <cl@linux.com>
>>>>>> Cc: Minchan Kim <minchan.kim@gmail.com>
>>>>>> CC: Andrew Morton <akpm@linux-foundation.org>
>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>>> CC: Wen Congyang <wency@cn.fujitsu.com>
>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>> ---
>>>>>>     mm/memory_hotplug.c | 13 ++++++++-----
>>>>>>     1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>>>>> index ca07433..66a79a7 100644
>>>>>> --- a/mm/memory_hotplug.c
>>>>>> +++ b/mm/memory_hotplug.c
>>>>>> @@ -286,11 +286,14 @@ static int __meminit __add_section(int nid,
>>>>>> struct zone *zone,
>>>>>>     #ifdef CONFIG_SPARSEMEM_VMEMMAP
>>>>>>     static int __remove_section(struct zone *zone, struct mem_section
>>>>>> *ms)
>>>>>>     {
>>>>>> -    /*
>>>>>> -     * XXX: Freeing memmap with vmemmap is not implement yet.
>>>>>> -     *      This should be removed later.
>>>>>> -     */
>>>>>> -    return -EBUSY;
>>>>>> +    int ret = -EINVAL;
>>>>>> +
>>>>>> +    if (!valid_section(ms))
>>>>>> +        return ret;
>>>>>> +
>>>>>> +    ret = unregister_memory_section(ms);
>>>>>> +
>>>>>> +    return ret;
>>>>>>     }
>>>>>>     #else
>>>>>>     static int __remove_section(struct zone *zone, struct mem_section
>>>>>> *ms)
>>>
> 
> 


Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ca07433..66a79a7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -286,11 +286,14 @@  static int __meminit __add_section(int nid, struct zone *zone,
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 static int __remove_section(struct zone *zone, struct mem_section *ms)
 {
-	/*
-	 * XXX: Freeing memmap with vmemmap is not implement yet.
-	 *      This should be removed later.
-	 */
-	return -EBUSY;
+	int ret = -EINVAL;
+
+	if (!valid_section(ms))
+		return ret;
+
+	ret = unregister_memory_section(ms);
+
+	return ret;
 }
 #else
 static int __remove_section(struct zone *zone, struct mem_section *ms)