diff mbox

[v2,8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB

Message ID 53881E1E.7050009@ozlabs.ru
State New
Headers show

Commit Message

Alexey Kardashevskiy May 30, 2014, 5:58 a.m. UTC
On 05/28/2014 09:35 PM, Alexander Graf wrote:
> 
> On 28.05.14 03:18, Alexey Kardashevskiy wrote:
>> On 05/28/2014 10:41 AM, Alexander Graf wrote:
>>> On 28.05.14 02:34, Alexey Kardashevskiy wrote:
>>>> On 05/28/2014 09:55 AM, Alexander Graf wrote:
> 
> ...
> 
>>>>> How do I migrate GHashTable? If I am allowed to use custom and bit more
>>>>> polished get/put from "[PATCH 1/2] vmstate: Add helper to enable
>>>>> GHashTable
>>>>> migration", I'll be fine.
>>>>> Yeah, I think it's ok to be custom in this case. Or another crazy idea -
>>>>> could you flatten the hash table into an array of structs that you can
>>>>> describe using VMState? You could then convert from the flat array
>>>>> to/from
>>>>> the GHashTable with pre_load/post_load hooks.
>>>> Array is exactly what I am trying to get rid of. Ok, I'll remove
>>>> hashmap at
>>>> all and implement dynamic flat array (yay, yet another bicycle!).
>>> Huh? The array would only live during the migration. It would be size=0
>>> during normal execution, but in a pre_save hook we could make size =
>>> hash.length() and reuse the existing, working VMState infrastructure.
>> When would I free that array? What if I continue the source guest and then
>> migrate again?
> 
> Something like
> 
> void pre_save(...) {
>     free(s->array);
>     s->array_len = s->hash.number_of_keys();
>     s->array = g_malloc(s->array_len * sizeof(struct array_elem));
>     for (i = 0; i < s->array_len; i++) {
>         s->array[i].key = s->hash.key[i];
>         s->array[i].value = s->hash.value[i];
>     }
> }
> 
> That would waste a few bytes when we continue after migration, but it
> should at least keep that overhead to a minimum.


Ok. Fine. When do I allocate an array on the destination then? Remember, I
do not know the number of device being transferred in advance because of
PCI hotplug so I cannot guess. sPAPRPHBState::pre_load is too early - I do
not know the size yet.

I can:
1. transfer size separately as part of sPAPRPHBState
2. move this temporary array into a subsection
3. allocate array in the subsection's pre_load() in a hope that QEMU will
call subsection's pre_load() AFTER the size of array is transferred.

This is true for now but what if one day someone decides that all
pre_load() callbacks from all subsections must be called at once at the
beginning of the object migration? I am screwed then.



Oooor I can make a patch as below (did not test it at all, just an idea).
Basically define VMS_ALLOC flag which will allocate necessary amount of
memory for that thing.


I am definitely missing somewhere here, as usual, and there must be already
some cool hack which I do not see, so what is it?






>> I mean I can solve all of this for sure but duplicating data
>> just to make existing migration happy is bit weird. But - I'll do what you
>> say here, it is no big deal :)
> 
> I don't find the concept of duplicating data for migration too odd - it
> sounds like a good compromise between introspectability and abstraction. If
> you have a better suggestion I'm all open :).

Comments

Alexander Graf May 30, 2014, 8 a.m. UTC | #1
On 30.05.14 07:58, Alexey Kardashevskiy wrote:
> On 05/28/2014 09:35 PM, Alexander Graf wrote:
>> On 28.05.14 03:18, Alexey Kardashevskiy wrote:
>>> On 05/28/2014 10:41 AM, Alexander Graf wrote:
>>>> On 28.05.14 02:34, Alexey Kardashevskiy wrote:
>>>>> On 05/28/2014 09:55 AM, Alexander Graf wrote:
>> ...
>>
>>>>>> How do I migrate GHashTable? If I am allowed to use custom and bit more
>>>>>> polished get/put from "[PATCH 1/2] vmstate: Add helper to enable
>>>>>> GHashTable
>>>>>> migration", I'll be fine.
>>>>>> Yeah, I think it's ok to be custom in this case. Or another crazy idea -
>>>>>> could you flatten the hash table into an array of structs that you can
>>>>>> describe using VMState? You could then convert from the flat array
>>>>>> to/from
>>>>>> the GHashTable with pre_load/post_load hooks.
>>>>> Array is exactly what I am trying to get rid of. Ok, I'll remove
>>>>> hashmap at
>>>>> all and implement dynamic flat array (yay, yet another bicycle!).
>>>> Huh? The array would only live during the migration. It would be size=0
>>>> during normal execution, but in a pre_save hook we could make size =
>>>> hash.length() and reuse the existing, working VMState infrastructure.
>>> When would I free that array? What if I continue the source guest and then
>>> migrate again?
>> Something like
>>
>> void pre_save(...) {
>>      free(s->array);
>>      s->array_len = s->hash.number_of_keys();
>>      s->array = g_malloc(s->array_len * sizeof(struct array_elem));
>>      for (i = 0; i < s->array_len; i++) {
>>          s->array[i].key = s->hash.key[i];
>>          s->array[i].value = s->hash.value[i];
>>      }
>> }
>>
>> That would waste a few bytes when we continue after migration, but it
>> should at least keep that overhead to a minimum.
>
> Ok. Fine. When do I allocate an array on the destination then? Remember, I
> do not know the number of device being transferred in advance because of
> PCI hotplug so I cannot guess. sPAPRPHBState::pre_load is too early - I do
> not know the size yet.

Honestly I wouldn't try to make things so complicated. You have a 
maximum size of the hash key array - there can't be more keys than 
devfns, right? So if you just allocate the maximum size on pre_load and 
free it on post_load, you should be good.


>
> I can:
> 1. transfer size separately as part of sPAPRPHBState
> 2. move this temporary array into a subsection
> 3. allocate array in the subsection's pre_load() in a hope that QEMU will
> call subsection's pre_load() AFTER the size of array is transferred.
>
> This is true for now but what if one day someone decides that all
> pre_load() callbacks from all subsections must be called at once at the
> beginning of the object migration? I am screwed then.
>
>
>
> Oooor I can make a patch as below (did not test it at all, just an idea).
> Basically define VMS_ALLOC flag which will allocate necessary amount of
> memory for that thing.

I like the patch below too though :). I'm sure that something like this 
would come in handy in other spots as well.


Alex

>
>
> I am definitely missing somewhere here, as usual, and there must be already
> some cool hack which I do not see, so what is it?
>
>
>
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 6af599d..7a14d26 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -101,6 +101,7 @@ enum VMStateFlags {
>       VMS_VARRAY_UINT8     = 0x400,  /* Array with size in uint8_t field*/
>       VMS_VARRAY_UINT32    = 0x800,  /* Array with size in uint32_t field*/
>       VMS_MUST_EXIST       = 0x1000, /* Field must exist in input */
> +    VMS_ALLOC            = 0x2000, /* Alloc a buffer on the destination */
>   };
>
>   typedef struct {
> @@ -750,6 +751,16 @@ static const VMStateInfo vmstate_info_hash;
>       .offset     = vmstate_offset_value(_state, _field, qemu_hash),      \
>   }
>
> +#define VMSTATE_VARRAY_STRUCT_ALLOC(_field, _state, _field_num, _version,
> _info, _type) {\
> +    .name       = (stringify(_field)),                               \
> +    .version_id = (_version),                                        \
> +    .num_offset = vmstate_offset_value(_state, _field_num, int32_t), \
> +    .info       = &(_info),                                          \
> +    .size       = sizeof(_type),                                     \
> +    .flags      = VMS_VARRAY_INT32|VMS_POINTER|VMS_ALLOC,            \
> +    .offset     = vmstate_offset_pointer(_state, _field, _type),     \
> +}
> +
>   #define VMSTATE_UNUSED_V(_v, _size)                                   \
>       VMSTATE_UNUSED_BUFFER(NULL, _v, _size)
>
> diff --git a/vmstate.c b/vmstate.c
> index e1518da..7d6b0b9 100644
> --- a/vmstate.c
> +++ b/vmstate.c
> @@ -48,6 +48,10 @@ static void *vmstate_base_addr(void *opaque,
> VMStateField *field)
>       void *base_addr = opaque + field->offset;
>
>       if (field->flags & VMS_POINTER) {
> +        if (field->flags & VMS_ALLOC) {
> +            n_elems = vmstate_n_elems(opaque, field);
> +            *base_addr = g_malloc_n(n_elems, field->size);
> +        }
>           base_addr = *(void **)base_addr + field->start;
>       }
>
>
>
>>> I mean I can solve all of this for sure but duplicating data
>>> just to make existing migration happy is bit weird. But - I'll do what you
>>> say here, it is no big deal :)
>> I don't find the concept of duplicating data for migration too odd - it
>> sounds like a good compromise between introspectability and abstraction. If
>> you have a better suggestion I'm all open :).
>
>
>
Alexey Kardashevskiy May 30, 2014, 9:01 a.m. UTC | #2
On 05/30/2014 06:00 PM, Alexander Graf wrote:
> 
> On 30.05.14 07:58, Alexey Kardashevskiy wrote:
>> On 05/28/2014 09:35 PM, Alexander Graf wrote:
>>> On 28.05.14 03:18, Alexey Kardashevskiy wrote:
>>>> On 05/28/2014 10:41 AM, Alexander Graf wrote:
>>>>> On 28.05.14 02:34, Alexey Kardashevskiy wrote:
>>>>>> On 05/28/2014 09:55 AM, Alexander Graf wrote:
>>> ...
>>>
>>>>>>> How do I migrate GHashTable? If I am allowed to use custom and bit more
>>>>>>> polished get/put from "[PATCH 1/2] vmstate: Add helper to enable
>>>>>>> GHashTable
>>>>>>> migration", I'll be fine.
>>>>>>> Yeah, I think it's ok to be custom in this case. Or another crazy
>>>>>>> idea -
>>>>>>> could you flatten the hash table into an array of structs that you can
>>>>>>> describe using VMState? You could then convert from the flat array
>>>>>>> to/from
>>>>>>> the GHashTable with pre_load/post_load hooks.
>>>>>> Array is exactly what I am trying to get rid of. Ok, I'll remove
>>>>>> hashmap at
>>>>>> all and implement dynamic flat array (yay, yet another bicycle!).
>>>>> Huh? The array would only live during the migration. It would be size=0
>>>>> during normal execution, but in a pre_save hook we could make size =
>>>>> hash.length() and reuse the existing, working VMState infrastructure.
>>>> When would I free that array? What if I continue the source guest and then
>>>> migrate again?
>>> Something like
>>>
>>> void pre_save(...) {
>>>      free(s->array);
>>>      s->array_len = s->hash.number_of_keys();
>>>      s->array = g_malloc(s->array_len * sizeof(struct array_elem));
>>>      for (i = 0; i < s->array_len; i++) {
>>>          s->array[i].key = s->hash.key[i];
>>>          s->array[i].value = s->hash.value[i];
>>>      }
>>> }
>>>
>>> That would waste a few bytes when we continue after migration, but it
>>> should at least keep that overhead to a minimum.
>>
>> Ok. Fine. When do I allocate an array on the destination then? Remember, I
>> do not know the number of device being transferred in advance because of
>> PCI hotplug so I cannot guess. sPAPRPHBState::pre_load is too early - I do
>> not know the size yet.
> 
> Honestly I wouldn't try to make things so complicated. You have a maximum
> size of the hash key array - there can't be more keys than devfns, right?

bus-dev-fn which gives us 65536. Which is quite a lot. If it was just
dev-fn, I would keep a static array and that's it.


> So if you just allocate the maximum size on pre_load and free it on
> post_load, you should be good.




> 
>>
>> I can:
>> 1. transfer size separately as part of sPAPRPHBState
>> 2. move this temporary array into a subsection
>> 3. allocate array in the subsection's pre_load() in a hope that QEMU will
>> call subsection's pre_load() AFTER the size of array is transferred.
>>
>> This is true for now but what if one day someone decides that all
>> pre_load() callbacks from all subsections must be called at once at the
>> beginning of the object migration? I am screwed then.
>>
>>
>>
>> Oooor I can make a patch as below (did not test it at all, just an idea).
>> Basically define VMS_ALLOC flag which will allocate necessary amount of
>> memory for that thing.
> 
> I like the patch below too though :). I'm sure that something like this
> would come in handy in other spots as well.


I go for it now, I'll post the series later tonight.


> 
> Alex
> 
>>
>>
>> I am definitely missing somewhere here, as usual, and there must be already
>> some cool hack which I do not see, so what is it?
>>
>>
>>
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index 6af599d..7a14d26 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -101,6 +101,7 @@ enum VMStateFlags {
>>       VMS_VARRAY_UINT8     = 0x400,  /* Array with size in uint8_t field*/
>>       VMS_VARRAY_UINT32    = 0x800,  /* Array with size in uint32_t field*/
>>       VMS_MUST_EXIST       = 0x1000, /* Field must exist in input */
>> +    VMS_ALLOC            = 0x2000, /* Alloc a buffer on the destination */
>>   };
>>
>>   typedef struct {
>> @@ -750,6 +751,16 @@ static const VMStateInfo vmstate_info_hash;
>>       .offset     = vmstate_offset_value(_state, _field, qemu_hash),      \
>>   }
>>
>> +#define VMSTATE_VARRAY_STRUCT_ALLOC(_field, _state, _field_num, _version,
>> _info, _type) {\
>> +    .name       = (stringify(_field)),                               \
>> +    .version_id = (_version),                                        \
>> +    .num_offset = vmstate_offset_value(_state, _field_num, int32_t), \
>> +    .info       = &(_info),                                          \
>> +    .size       = sizeof(_type),                                     \
>> +    .flags      = VMS_VARRAY_INT32|VMS_POINTER|VMS_ALLOC,            \
>> +    .offset     = vmstate_offset_pointer(_state, _field, _type),     \
>> +}
>> +
>>   #define VMSTATE_UNUSED_V(_v, _size)                                   \
>>       VMSTATE_UNUSED_BUFFER(NULL, _v, _size)
>>
>> diff --git a/vmstate.c b/vmstate.c
>> index e1518da..7d6b0b9 100644
>> --- a/vmstate.c
>> +++ b/vmstate.c
>> @@ -48,6 +48,10 @@ static void *vmstate_base_addr(void *opaque,
>> VMStateField *field)
>>       void *base_addr = opaque + field->offset;
>>
>>       if (field->flags & VMS_POINTER) {
>> +        if (field->flags & VMS_ALLOC) {
>> +            n_elems = vmstate_n_elems(opaque, field);
>> +            *base_addr = g_malloc_n(n_elems, field->size);
>> +        }
>>           base_addr = *(void **)base_addr + field->start;
>>       }
>>
>>
>>
>>>> I mean I can solve all of this for sure but duplicating data
>>>> just to make existing migration happy is bit weird. But - I'll do what you
>>>> say here, it is no big deal :)
>>> I don't find the concept of duplicating data for migration too odd - it
>>> sounds like a good compromise between introspectability and abstraction. If
>>> you have a better suggestion I'm all open :).
>>
>>
>>
>
diff mbox

Patch

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 6af599d..7a14d26 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -101,6 +101,7 @@  enum VMStateFlags {
     VMS_VARRAY_UINT8     = 0x400,  /* Array with size in uint8_t field*/
     VMS_VARRAY_UINT32    = 0x800,  /* Array with size in uint32_t field*/
     VMS_MUST_EXIST       = 0x1000, /* Field must exist in input */
+    VMS_ALLOC            = 0x2000, /* Alloc a buffer on the destination */
 };

 typedef struct {
@@ -750,6 +751,16 @@  static const VMStateInfo vmstate_info_hash;
     .offset     = vmstate_offset_value(_state, _field, qemu_hash),      \
 }

+#define VMSTATE_VARRAY_STRUCT_ALLOC(_field, _state, _field_num, _version,
_info, _type) {\
+    .name       = (stringify(_field)),                               \
+    .version_id = (_version),                                        \
+    .num_offset = vmstate_offset_value(_state, _field_num, int32_t), \
+    .info       = &(_info),                                          \
+    .size       = sizeof(_type),                                     \
+    .flags      = VMS_VARRAY_INT32|VMS_POINTER|VMS_ALLOC,            \
+    .offset     = vmstate_offset_pointer(_state, _field, _type),     \
+}
+
 #define VMSTATE_UNUSED_V(_v, _size)                                   \
     VMSTATE_UNUSED_BUFFER(NULL, _v, _size)

diff --git a/vmstate.c b/vmstate.c
index e1518da..7d6b0b9 100644
--- a/vmstate.c
+++ b/vmstate.c
@@ -48,6 +48,10 @@  static void *vmstate_base_addr(void *opaque,
VMStateField *field)
     void *base_addr = opaque + field->offset;

     if (field->flags & VMS_POINTER) {
+        if (field->flags & VMS_ALLOC) {
+            n_elems = vmstate_n_elems(opaque, field);
+            *base_addr = g_malloc_n(n_elems, field->size);
+        }
         base_addr = *(void **)base_addr + field->start;
     }