Patchwork [05/10] KVM don't care about TLB handling

login
register
mail settings
Submitter Juan Quintela
Date Nov. 23, 2010, 11:03 p.m.
Message ID <8975f5409c231fba23ddc866248f33718cc50e22.1290552026.git.quintela@redhat.com>
Download mbox | patch
Permalink /patch/72773/
State New
Headers show

Comments

Juan Quintela - Nov. 23, 2010, 11:03 p.m.
From: Juan Quintela <quintela@trasno.org>

TLB handling is only used in TCG mode.  It is very costly for guests with lots
of memory ad lots of CPU's.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@trasno.org>
---
 exec.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
Anthony Liguori - Nov. 30, 2010, 2:04 a.m.
On 11/23/2010 05:03 PM, Juan Quintela wrote:
> From: Juan Quintela<quintela@trasno.org>
>
> TLB handling is only used in TCG mode.  It is very costly for guests with lots
> of memory ad lots of CPU's.
>
> Signed-off-by: Juan Quintela<quintela@redhat.com>
> Signed-off-by: Juan Quintela<quintela@trasno.org>
> ---
>   exec.c |    3 +++
>   1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index db9ff55..f5b2386 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2028,6 +2028,9 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
>           return;
>       cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
>
> +    if (kvm_enabled())
> +        return;
> +
>    

Not a bad idea, but this is the wrong approach.

Refactor the following code into another function 
tlb_reset_dirty_range_all() and then have a nop KVM implementation.  
Exits in the middle of functions are difficult to maintain long term 
whereas refactoring the function provides a more obvious idea about 
what's going on and why it's not needed for KVM (btw, a comment is 
definitely needed).

Regards,

Anthony Liguori

>       /* we modify the TLB cache so that the dirty bit will be set again
>          when accessing the range */
>       start1 = (unsigned long)qemu_get_ram_ptr(start);
>
Juan Quintela - Nov. 30, 2010, 2:09 a.m.
Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 11/23/2010 05:03 PM, Juan Quintela wrote:
>> From: Juan Quintela<quintela@trasno.org>
>>
>> TLB handling is only used in TCG mode.  It is very costly for guests with lots
>> of memory ad lots of CPU's.
>>
>> Signed-off-by: Juan Quintela<quintela@redhat.com>
>> Signed-off-by: Juan Quintela<quintela@trasno.org>
>> ---
>>   exec.c |    3 +++
>>   1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index db9ff55..f5b2386 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2028,6 +2028,9 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
>>           return;
>>       cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
>>
>> +    if (kvm_enabled())
>> +        return;
>> +
>>    
>
> Not a bad idea, but this is the wrong approach.
>
> Refactor the following code into another function
> tlb_reset_dirty_range_all() and then have a nop KVM implementation.
> Exits in the middle of functions are difficult to maintain long term
> whereas refactoring the function provides a more obvious idea about
> what's going on and why it's not needed for KVM (btw, a comment is
> definitely needed).

I was heading here for something that was able to be backported for
stable.  I agree that "longer term" we need to split this code (how we
manage the bitmap is way too expensive for kvm and lots of RAM).

THanks, Juan.

> Regards,
>
> Anthony Liguori
>
>>       /* we modify the TLB cache so that the dirty bit will be set again
>>          when accessing the range */
>>       start1 = (unsigned long)qemu_get_ram_ptr(start);
>>
Anthony Liguori - Nov. 30, 2010, 2:24 a.m.
On 11/29/2010 08:09 PM, Juan Quintela wrote:
> Anthony Liguori<anthony@codemonkey.ws>  wrote:
>    
>> On 11/23/2010 05:03 PM, Juan Quintela wrote:
>>      
>>> From: Juan Quintela<quintela@trasno.org>
>>>
>>> TLB handling is only used in TCG mode.  It is very costly for guests with lots
>>> of memory ad lots of CPU's.
>>>
>>> Signed-off-by: Juan Quintela<quintela@redhat.com>
>>> Signed-off-by: Juan Quintela<quintela@trasno.org>
>>> ---
>>>    exec.c |    3 +++
>>>    1 files changed, 3 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/exec.c b/exec.c
>>> index db9ff55..f5b2386 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -2028,6 +2028,9 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
>>>            return;
>>>        cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
>>>
>>> +    if (kvm_enabled())
>>> +        return;
>>> +
>>>
>>>        
>> Not a bad idea, but this is the wrong approach.
>>
>> Refactor the following code into another function
>> tlb_reset_dirty_range_all() and then have a nop KVM implementation.
>> Exits in the middle of functions are difficult to maintain long term
>> whereas refactoring the function provides a more obvious idea about
>> what's going on and why it's not needed for KVM (btw, a comment is
>> definitely needed).
>>      
> I was heading here for something that was able to be backported for
> stable.

Not a good consideration for master.

>    I agree that "longer term" we need to split this code (how we
> manage the bitmap is way too expensive for kvm and lots of RAM).
>    

That's fundamentally KVM's problem and that's where we should fix it.  
Otherwise, we're working around a broken interface forever in QEMU.

Regards,

Anthony Liguori

> THanks, Juan.
>
>    
>> Regards,
>>
>> Anthony Liguori
>>
>>      
>>>        /* we modify the TLB cache so that the dirty bit will be set again
>>>           when accessing the range */
>>>        start1 = (unsigned long)qemu_get_ram_ptr(start);
>>>
>>>

Patch

diff --git a/exec.c b/exec.c
index db9ff55..f5b2386 100644
--- a/exec.c
+++ b/exec.c
@@ -2028,6 +2028,9 @@  void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
         return;
     cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);

+    if (kvm_enabled())
+        return;
+
     /* we modify the TLB cache so that the dirty bit will be set again
        when accessing the range */
     start1 = (unsigned long)qemu_get_ram_ptr(start);