diff mbox

memory: could we add extra input param for memory_region_init_io()?

Message ID CAJnKYQn5zfmEk1-+auZZapUkh-XMFL-prbM9ATCxsEAoQJ0dpg@mail.gmail.com
State New
Headers show

Commit Message

pingfan liu Aug. 14, 2012, 8:30 a.m. UTC
To make memoryRegion survive without the protection of qemu big lock,
we need to pin its based Object.
In current code, the type of mr->opaque are quite different.  Lots of
them are Object, but quite of them are not yet.

The most challenge for changing from memory_region_init_io(..., void
*opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
such codes:
hw/ide/cmd646.c:
static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
{
    IDEBus *bus = &d->bus[bus_num];
    CMD646BAR *bar = &d->cmd646_bar[bus_num];

    bar->bus = bus;
    bar->pci_dev = d;
    memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
    memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
}
If we passed in mr's based Object @d to substitute @bar, then we can
not pass the extra info @bus_num.

To solve such issue, introduce extra member "Object *base" for MemoryRegion.



Any comment?

Thanks,
pingfan

Comments

Avi Kivity Aug. 14, 2012, 10:49 a.m. UTC | #1
On 08/14/2012 11:30 AM, liu ping fan wrote:
> To make memoryRegion survive without the protection of qemu big lock,
> we need to pin its based Object.
> In current code, the type of mr->opaque are quite different.  Lots of
> them are Object, but quite of them are not yet.
> 
> The most challenge for changing from memory_region_init_io(..., void
> *opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
> such codes:
> hw/ide/cmd646.c:
> static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
> {
>     IDEBus *bus = &d->bus[bus_num];
>     CMD646BAR *bar = &d->cmd646_bar[bus_num];
> 
>     bar->bus = bus;
>     bar->pci_dev = d;
>     memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
>     memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
> }
> If we passed in mr's based Object @d to substitute @bar, then we can
> not pass the extra info @bus_num.
> 
> To solve such issue, introduce extra member "Object *base" for MemoryRegion.
> 
> diff --git a/memory.c b/memory.c
> index 643871b..afd5dea 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -931,6 +931,7 @@ static void memory_region_dispatch_write(MemoryRegion *mr,
> 
>  void memory_region_init_io(MemoryRegion *mr,
>                             const MemoryRegionOps *ops,
> +                           Object *base,
>                             void *opaque,
>                             const char *name,
>                             uint64_t size)
> @@ -938,6 +939,7 @@ void memory_region_init_io(MemoryRegion *mr,
>      memory_region_init(mr, name, size);
>      mr->ops = ops;
>      mr->opaque = opaque;
> +    mr->base = base;
>      mr->terminates = true;
>      mr->destructor = memory_region_destructor_iomem;
>      mr->ram_addr = ~(ram_addr_t)0;
> diff --git a/memory.h b/memory.h
> index bd1bbae..2746e70 100644
> --- a/memory.h
> +++ b/memory.h
> @@ -120,6 +120,7 @@ struct MemoryRegion {
>      /* All fields are private - violators will be prosecuted */
>      const MemoryRegionOps *ops;
>      void *opaque;
> +    Object *base;
>      MemoryRegion *parent;
>      Int128 size;
>      target_phys_addr_t addr;
> 
> 
> Any comment?
> 

I prefer that we convert the third parameter (opaque) to be an Object.
That is a huge change, but I think it will improve the code base overall.

Other options are:

1) add MemoryRegionOps::ref(MemoryRegion *) and ::unref(MemoryRegion *)

If NULL, these callbacks are ignored.  If not, they are called with the
MemoryRegion as a parameter.  Their responsibility is to derive the
Object from the MemoryRegion (through the opaque or using
container_of()) and ref or unref it respectively.

2) add Object *MemoryRegionOps::object(MemoryRegion *)

Similar; if NULL it is ignored, otherwise it is used to derive the
Object, which the memory core will ref and unref.

3) add bool MemoryRegionOps::opaque_is_object

Tells the memory core that it is safe to cast the opaque into an Object.

4) add memory_region_set_object(MemoryRegion *, Object *)

Like your proposal, but avoids adding an extra paramter and changing all
call sites.
pingfan liu Aug. 16, 2012, 3:22 a.m. UTC | #2
On Tue, Aug 14, 2012 at 6:49 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/14/2012 11:30 AM, liu ping fan wrote:
>> To make memoryRegion survive without the protection of qemu big lock,
>> we need to pin its based Object.
>> In current code, the type of mr->opaque are quite different.  Lots of
>> them are Object, but quite of them are not yet.
>>
>> The most challenge for changing from memory_region_init_io(..., void
>> *opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
>> such codes:
>> hw/ide/cmd646.c:
>> static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
>> {
>>     IDEBus *bus = &d->bus[bus_num];
>>     CMD646BAR *bar = &d->cmd646_bar[bus_num];
>>
>>     bar->bus = bus;
>>     bar->pci_dev = d;
>>     memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
>>     memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
>> }
>> If we passed in mr's based Object @d to substitute @bar, then we can
>> not pass the extra info @bus_num.
>>
>> To solve such issue, introduce extra member "Object *base" for MemoryRegion.
>>
>> diff --git a/memory.c b/memory.c
>> index 643871b..afd5dea 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -931,6 +931,7 @@ static void memory_region_dispatch_write(MemoryRegion *mr,
>>
>>  void memory_region_init_io(MemoryRegion *mr,
>>                             const MemoryRegionOps *ops,
>> +                           Object *base,
>>                             void *opaque,
>>                             const char *name,
>>                             uint64_t size)
>> @@ -938,6 +939,7 @@ void memory_region_init_io(MemoryRegion *mr,
>>      memory_region_init(mr, name, size);
>>      mr->ops = ops;
>>      mr->opaque = opaque;
>> +    mr->base = base;
>>      mr->terminates = true;
>>      mr->destructor = memory_region_destructor_iomem;
>>      mr->ram_addr = ~(ram_addr_t)0;
>> diff --git a/memory.h b/memory.h
>> index bd1bbae..2746e70 100644
>> --- a/memory.h
>> +++ b/memory.h
>> @@ -120,6 +120,7 @@ struct MemoryRegion {
>>      /* All fields are private - violators will be prosecuted */
>>      const MemoryRegionOps *ops;
>>      void *opaque;
>> +    Object *base;
>>      MemoryRegion *parent;
>>      Int128 size;
>>      target_phys_addr_t addr;
>>
>>
>> Any comment?
>>
>
> I prefer that we convert the third parameter (opaque) to be an Object.
> That is a huge change, but I think it will improve the code base overall.
>
Object may be many different opaque, and each has different
MemoryRegionOps. We need to pass in both object and opaque.
Maybe we can use Object's property to store the pair (mr, opaque),
then we can use mr as key to get opaque in mmio-dispatch, but the
property's query will hurt the performance.
Or define a new struct X {Object *base, void *opaque}, and pass it to
memory_region_init_io() to substitute "void *opaque" . Finally,
reclaim X in memory_region_destroy().


> Other options are:
>
> 1) add MemoryRegionOps::ref(MemoryRegion *) and ::unref(MemoryRegion *)
>
> If NULL, these callbacks are ignored.  If not, they are called with the
> MemoryRegion as a parameter.  Their responsibility is to derive the
> Object from the MemoryRegion (through the opaque or using
> container_of()) and ref or unref it respectively.
>
> 2) add Object *MemoryRegionOps::object(MemoryRegion *)
>
> Similar; if NULL it is ignored, otherwise it is used to derive the
> Object, which the memory core will ref and unref.
>
> 3) add bool MemoryRegionOps::opaque_is_object
>
> Tells the memory core that it is safe to cast the opaque into an Object.
>
Above methods, the  process of derive the Object will be hard, we can
not tell opaque is Object or not without something like try&catch

> 4) add memory_region_set_object(MemoryRegion *, Object *)
>
> Like your proposal, but avoids adding an extra paramter and changing all
> call sites.
>
Yeah, this seems the easy one.

Thanks and regards,
pingfan
> --
> error compiling committee.c: too many arguments to function
Avi Kivity Aug. 16, 2012, 9:23 a.m. UTC | #3
On 08/16/2012 06:22 AM, liu ping fan wrote:
> On Tue, Aug 14, 2012 at 6:49 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 08/14/2012 11:30 AM, liu ping fan wrote:
>>> To make memoryRegion survive without the protection of qemu big lock,
>>> we need to pin its based Object.
>>> In current code, the type of mr->opaque are quite different.  Lots of
>>> them are Object, but quite of them are not yet.
>>>
>>> The most challenge for changing from memory_region_init_io(..., void
>>> *opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
>>> such codes:
>>> hw/ide/cmd646.c:
>>> static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
>>> {
>>>     IDEBus *bus = &d->bus[bus_num];
>>>     CMD646BAR *bar = &d->cmd646_bar[bus_num];
>>>
>>>     bar->bus = bus;
>>>     bar->pci_dev = d;
>>>     memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
>>>     memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
>>> }
>>> If we passed in mr's based Object @d to substitute @bar, then we can
>>> not pass the extra info @bus_num.
>>>
>>> To solve such issue, introduce extra member "Object *base" for MemoryRegion.
>>>
>>> diff --git a/memory.c b/memory.c
>>> index 643871b..afd5dea 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -931,6 +931,7 @@ static void memory_region_dispatch_write(MemoryRegion *mr,
>>>
>>>  void memory_region_init_io(MemoryRegion *mr,
>>>                             const MemoryRegionOps *ops,
>>> +                           Object *base,
>>>                             void *opaque,
>>>                             const char *name,
>>>                             uint64_t size)
>>> @@ -938,6 +939,7 @@ void memory_region_init_io(MemoryRegion *mr,
>>>      memory_region_init(mr, name, size);
>>>      mr->ops = ops;
>>>      mr->opaque = opaque;
>>> +    mr->base = base;
>>>      mr->terminates = true;
>>>      mr->destructor = memory_region_destructor_iomem;
>>>      mr->ram_addr = ~(ram_addr_t)0;
>>> diff --git a/memory.h b/memory.h
>>> index bd1bbae..2746e70 100644
>>> --- a/memory.h
>>> +++ b/memory.h
>>> @@ -120,6 +120,7 @@ struct MemoryRegion {
>>>      /* All fields are private - violators will be prosecuted */
>>>      const MemoryRegionOps *ops;
>>>      void *opaque;
>>> +    Object *base;
>>>      MemoryRegion *parent;
>>>      Int128 size;
>>>      target_phys_addr_t addr;
>>>
>>>
>>> Any comment?
>>>
>>
>> I prefer that we convert the third parameter (opaque) to be an Object.
>> That is a huge change, but I think it will improve the code base overall.
>>
> Object may be many different opaque, and each has different
> MemoryRegionOps. We need to pass in both object and opaque.

Why?  Usually there's a 1:1 mapping between object and opaque.  Can you
show cases where there isn't?

> Maybe we can use Object's property to store the pair (mr, opaque),
> then we can use mr as key to get opaque in mmio-dispatch, but the
> property's query will hurt the performance.
> Or define a new struct X {Object *base, void *opaque}, and pass it to
> memory_region_init_io() to substitute "void *opaque" . Finally,
> reclaim X in memory_region_destroy().

Usually the access callback can just cast the object to the real type.
That's all that's needed.

> 
> 
>> Other options are:
>>
>> 1) add MemoryRegionOps::ref(MemoryRegion *) and ::unref(MemoryRegion *)
>>
>> If NULL, these callbacks are ignored.  If not, they are called with the
>> MemoryRegion as a parameter.  Their responsibility is to derive the
>> Object from the MemoryRegion (through the opaque or using
>> container_of()) and ref or unref it respectively.
>>
>> 2) add Object *MemoryRegionOps::object(MemoryRegion *)
>>
>> Similar; if NULL it is ignored, otherwise it is used to derive the
>> Object, which the memory core will ref and unref.
>>
>> 3) add bool MemoryRegionOps::opaque_is_object
>>
>> Tells the memory core that it is safe to cast the opaque into an Object.
>>
> Above methods, the  process of derive the Object will be hard, we can
> not tell opaque is Object or not without something like try&catch

Take for example e1000.  It passes E1000State as the opaque, which is a
PCIDevice, which is a DeviceState, which is an Object.  So for that
device, nothing needs to be done.

> 
>> 4) add memory_region_set_object(MemoryRegion *, Object *)
>>
>> Like your proposal, but avoids adding an extra paramter and changing all
>> call sites.
>>
> Yeah, this seems the easy one.

Easy but wrong, IMO.
pingfan liu Aug. 17, 2012, 2:52 a.m. UTC | #4
On Thu, Aug 16, 2012 at 5:23 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/16/2012 06:22 AM, liu ping fan wrote:
>> On Tue, Aug 14, 2012 at 6:49 PM, Avi Kivity <avi@redhat.com> wrote:
>>> On 08/14/2012 11:30 AM, liu ping fan wrote:
>>>> To make memoryRegion survive without the protection of qemu big lock,
>>>> we need to pin its based Object.
>>>> In current code, the type of mr->opaque are quite different.  Lots of
>>>> them are Object, but quite of them are not yet.
>>>>
>>>> The most challenge for changing from memory_region_init_io(..., void
>>>> *opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
>>>> such codes:
>>>> hw/ide/cmd646.c:
>>>> static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
>>>> {
>>>>     IDEBus *bus = &d->bus[bus_num];
>>>>     CMD646BAR *bar = &d->cmd646_bar[bus_num];
>>>>
>>>>     bar->bus = bus;
>>>>     bar->pci_dev = d;
>>>>     memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
>>>>     memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
>>>> }
>>>> If we passed in mr's based Object @d to substitute @bar, then we can
>>>> not pass the extra info @bus_num.
>>>>
>>>> To solve such issue, introduce extra member "Object *base" for MemoryRegion.
>>>>
>>>> diff --git a/memory.c b/memory.c
>>>> index 643871b..afd5dea 100644
>>>> --- a/memory.c
>>>> +++ b/memory.c
>>>> @@ -931,6 +931,7 @@ static void memory_region_dispatch_write(MemoryRegion *mr,
>>>>
>>>>  void memory_region_init_io(MemoryRegion *mr,
>>>>                             const MemoryRegionOps *ops,
>>>> +                           Object *base,
>>>>                             void *opaque,
>>>>                             const char *name,
>>>>                             uint64_t size)
>>>> @@ -938,6 +939,7 @@ void memory_region_init_io(MemoryRegion *mr,
>>>>      memory_region_init(mr, name, size);
>>>>      mr->ops = ops;
>>>>      mr->opaque = opaque;
>>>> +    mr->base = base;
>>>>      mr->terminates = true;
>>>>      mr->destructor = memory_region_destructor_iomem;
>>>>      mr->ram_addr = ~(ram_addr_t)0;
>>>> diff --git a/memory.h b/memory.h
>>>> index bd1bbae..2746e70 100644
>>>> --- a/memory.h
>>>> +++ b/memory.h
>>>> @@ -120,6 +120,7 @@ struct MemoryRegion {
>>>>      /* All fields are private - violators will be prosecuted */
>>>>      const MemoryRegionOps *ops;
>>>>      void *opaque;
>>>> +    Object *base;
>>>>      MemoryRegion *parent;
>>>>      Int128 size;
>>>>      target_phys_addr_t addr;
>>>>
>>>>
>>>> Any comment?
>>>>
>>>
>>> I prefer that we convert the third parameter (opaque) to be an Object.
>>> That is a huge change, but I think it will improve the code base overall.
>>>
>> Object may be many different opaque, and each has different
>> MemoryRegionOps. We need to pass in both object and opaque.
>
> Why?  Usually there's a 1:1 mapping between object and opaque.  Can you
> show cases where there isn't?
>
As mentioned ahead, setup_cmd646_bar(PCIIDEState *d, int bus_num),
Object=@d, but opaque are
d->cmd646_bar[bus_num], so that is 1:n mapping. And when I browsing
the code, this is the main issue prevent us to transfer from void* to
Object* for memory_region_init_io()

>> Maybe we can use Object's property to store the pair (mr, opaque),
>> then we can use mr as key to get opaque in mmio-dispatch, but the
>> property's query will hurt the performance.
>> Or define a new struct X {Object *base, void *opaque}, and pass it to
>> memory_region_init_io() to substitute "void *opaque" . Finally,
>> reclaim X in memory_region_destroy().
>
> Usually the access callback can just cast the object to the real type.
> That's all that's needed.
>
OK, I see
>>
>>
>>> Other options are:
>>>
>>> 1) add MemoryRegionOps::ref(MemoryRegion *) and ::unref(MemoryRegion *)
>>>
>>> If NULL, these callbacks are ignored.  If not, they are called with the
>>> MemoryRegion as a parameter.  Their responsibility is to derive the
>>> Object from the MemoryRegion (through the opaque or using
>>> container_of()) and ref or unref it respectively.
>>>
>>> 2) add Object *MemoryRegionOps::object(MemoryRegion *)
>>>
>>> Similar; if NULL it is ignored, otherwise it is used to derive the
>>> Object, which the memory core will ref and unref.
>>>
>>> 3) add bool MemoryRegionOps::opaque_is_object
>>>
>>> Tells the memory core that it is safe to cast the opaque into an Object.
>>>
>> Above methods, the  process of derive the Object will be hard, we can
>> not tell opaque is Object or not without something like try&catch
>
> Take for example e1000.  It passes E1000State as the opaque, which is a
> PCIDevice, which is a DeviceState, which is an Object.  So for that
> device, nothing needs to be done.
>
The same example, in setup_cmd646_bar(PCIIDEState *d, int bus_num),  I
think we can not decide which is the type for @bar.  If using
object_dynamic_cast(@bar, TYPE_OBJECT) to tell whether it is Object or
not, it will raise exception.

>>
>>> 4) add memory_region_set_object(MemoryRegion *, Object *)
>>>
>>> Like your proposal, but avoids adding an extra paramter and changing all
>>> call sites.
>>>
>> Yeah, this seems the easy one.
>
> Easy but wrong, IMO.

Yeah, I am trying to avoid to do such things, and still try to find
another way out.

Thanx, pingfan
>
> --
> error compiling committee.c: too many arguments to function
pingfan liu Aug. 17, 2012, 7:41 a.m. UTC | #5
On Fri, Aug 17, 2012 at 10:52 AM, liu ping fan <qemulist@gmail.com> wrote:
> On Thu, Aug 16, 2012 at 5:23 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 08/16/2012 06:22 AM, liu ping fan wrote:
>>> On Tue, Aug 14, 2012 at 6:49 PM, Avi Kivity <avi@redhat.com> wrote:
>>>> On 08/14/2012 11:30 AM, liu ping fan wrote:
>>>>> To make memoryRegion survive without the protection of qemu big lock,
>>>>> we need to pin its based Object.
>>>>> In current code, the type of mr->opaque are quite different.  Lots of
>>>>> them are Object, but quite of them are not yet.
>>>>>
>>>>> The most challenge for changing from memory_region_init_io(..., void
>>>>> *opaque, ...) to memory_region_init_io(..., Object *opaque,...) is
>>>>> such codes:
>>>>> hw/ide/cmd646.c:
>>>>> static void setup_cmd646_bar(PCIIDEState *d, int bus_num)
>>>>> {
>>>>>     IDEBus *bus = &d->bus[bus_num];
>>>>>     CMD646BAR *bar = &d->cmd646_bar[bus_num];
>>>>>
>>>>>     bar->bus = bus;
>>>>>     bar->pci_dev = d;
>>>>>     memory_region_init_io(&bar->cmd, &cmd646_cmd_ops, bar, "cmd646-cmd", 4);
>>>>>     memory_region_init_io(&bar->data, &cmd646_data_ops, bar, "cmd646-data", 8);
>>>>> }
>>>>> If we passed in mr's based Object @d to substitute @bar, then we can
>>>>> not pass the extra info @bus_num.
>>>>>
>>>>> To solve such issue, introduce extra member "Object *base" for MemoryRegion.
>>>>>
>>>>> diff --git a/memory.c b/memory.c
>>>>> index 643871b..afd5dea 100644
>>>>> --- a/memory.c
>>>>> +++ b/memory.c
>>>>> @@ -931,6 +931,7 @@ static void memory_region_dispatch_write(MemoryRegion *mr,
>>>>>
>>>>>  void memory_region_init_io(MemoryRegion *mr,
>>>>>                             const MemoryRegionOps *ops,
>>>>> +                           Object *base,
>>>>>                             void *opaque,
>>>>>                             const char *name,
>>>>>                             uint64_t size)
>>>>> @@ -938,6 +939,7 @@ void memory_region_init_io(MemoryRegion *mr,
>>>>>      memory_region_init(mr, name, size);
>>>>>      mr->ops = ops;
>>>>>      mr->opaque = opaque;
>>>>> +    mr->base = base;
>>>>>      mr->terminates = true;
>>>>>      mr->destructor = memory_region_destructor_iomem;
>>>>>      mr->ram_addr = ~(ram_addr_t)0;
>>>>> diff --git a/memory.h b/memory.h
>>>>> index bd1bbae..2746e70 100644
>>>>> --- a/memory.h
>>>>> +++ b/memory.h
>>>>> @@ -120,6 +120,7 @@ struct MemoryRegion {
>>>>>      /* All fields are private - violators will be prosecuted */
>>>>>      const MemoryRegionOps *ops;
>>>>>      void *opaque;
>>>>> +    Object *base;
>>>>>      MemoryRegion *parent;
>>>>>      Int128 size;
>>>>>      target_phys_addr_t addr;
>>>>>
>>>>>
>>>>> Any comment?
>>>>>
>>>>
>>>> I prefer that we convert the third parameter (opaque) to be an Object.
>>>> That is a huge change, but I think it will improve the code base overall.
>>>>
>>> Object may be many different opaque, and each has different
>>> MemoryRegionOps. We need to pass in both object and opaque.
>>
>> Why?  Usually there's a 1:1 mapping between object and opaque.  Can you
>> show cases where there isn't?
>>
> As mentioned ahead, setup_cmd646_bar(PCIIDEState *d, int bus_num),
> Object=@d, but opaque are
> d->cmd646_bar[bus_num], so that is 1:n mapping. And when I browsing
> the code, this is the main issue prevent us to transfer from void* to
> Object* for memory_region_init_io()
>
>>> Maybe we can use Object's property to store the pair (mr, opaque),
>>> then we can use mr as key to get opaque in mmio-dispatch, but the
>>> property's query will hurt the performance.
>>> Or define a new struct X {Object *base, void *opaque}, and pass it to
>>> memory_region_init_io() to substitute "void *opaque" . Finally,
>>> reclaim X in memory_region_destroy().
>>
>> Usually the access callback can just cast the object to the real type.
>> That's all that's needed.
>>
> OK, I see
>>>
>>>
>>>> Other options are:
>>>>
>>>> 1) add MemoryRegionOps::ref(MemoryRegion *) and ::unref(MemoryRegion *)
>>>>
>>>> If NULL, these callbacks are ignored.  If not, they are called with the
>>>> MemoryRegion as a parameter.  Their responsibility is to derive the
>>>> Object from the MemoryRegion (through the opaque or using
>>>> container_of()) and ref or unref it respectively.
>>>>
>>>> 2) add Object *MemoryRegionOps::object(MemoryRegion *)
>>>>
>>>> Similar; if NULL it is ignored, otherwise it is used to derive the
>>>> Object, which the memory core will ref and unref.
>>>>
>>>> 3) add bool MemoryRegionOps::opaque_is_object
>>>>
>>>> Tells the memory core that it is safe to cast the opaque into an Object.
>>>>
>>> Above methods, the  process of derive the Object will be hard, we can
>>> not tell opaque is Object or not without something like try&catch
>>
>> Take for example e1000.  It passes E1000State as the opaque, which is a
>> PCIDevice, which is a DeviceState, which is an Object.  So for that
>> device, nothing needs to be done.
>>
> The same example, in setup_cmd646_bar(PCIIDEState *d, int bus_num),  I
> think we can not decide which is the type for @bar.  If using
> object_dynamic_cast(@bar, TYPE_OBJECT) to tell whether it is Object or
> not, it will raise exception.
>
And something like omap_mpu_timer_init() in file hw/omap1.c , the
opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
even harder to handle.  And the DO_CAST can not work for such issue.

Regards, pingfan
>>>
>>>> 4) add memory_region_set_object(MemoryRegion *, Object *)
>>>>
>>>> Like your proposal, but avoids adding an extra paramter and changing all
>>>> call sites.
>>>>
>>> Yeah, this seems the easy one.
>>
>> Easy but wrong, IMO.
>
> Yeah, I am trying to avoid to do such things, and still try to find
> another way out.
>
> Thanx, pingfan
>>
>> --
>> error compiling committee.c: too many arguments to function
Avi Kivity Aug. 19, 2012, 10:10 a.m. UTC | #6
On 08/17/2012 05:52 AM, liu ping fan wrote:
>>
>> Why?  Usually there's a 1:1 mapping between object and opaque.  Can you
>> show cases where there isn't?
>>
> As mentioned ahead, setup_cmd646_bar(PCIIDEState *d, int bus_num),
> Object=@d, but opaque are
> d->cmd646_bar[bus_num], so that is 1:n mapping. And when I browsing
> the code, this is the main issue prevent us to transfer from void* to
> Object* for memory_region_init_io()

In this case there is a 1:1 relationship between CMD646BAR and IDEBus.
IDEBus is a BusState which is an Object.  So this case can be solved easily.

>>> Above methods, the  process of derive the Object will be hard, we can
>>> not tell opaque is Object or not without something like try&catch
>>
>> Take for example e1000.  It passes E1000State as the opaque, which is a
>> PCIDevice, which is a DeviceState, which is an Object.  So for that
>> device, nothing needs to be done.
>>
> The same example, in setup_cmd646_bar(PCIIDEState *d, int bus_num),  I
> think we can not decide which is the type for @bar.  If using
> object_dynamic_cast(@bar, TYPE_OBJECT) to tell whether it is Object or
> not, it will raise exception.

No, dynamic cast cannot work on an arbitrary void *.

There is only one legitimate case IMO where we don't naturally have an
object to work on - device assignment, where all the BARs are
equivalent.  In that case we can just make the BARs also Objects.
Everything else should work naturally with perhaps a little refactoring.
Avi Kivity Aug. 19, 2012, 10:12 a.m. UTC | #7
On 08/17/2012 10:41 AM, liu ping fan wrote:

>> The same example, in setup_cmd646_bar(PCIIDEState *d, int bus_num),  I
>> think we can not decide which is the type for @bar.  If using
>> object_dynamic_cast(@bar, TYPE_OBJECT) to tell whether it is Object or
>> not, it will raise exception.
>>
> And something like omap_mpu_timer_init() in file hw/omap1.c , the
> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
> even harder to handle.  And the DO_CAST can not work for such issue.

IMO omap_mpu_timer_s should be a DeviceState.  Peter?
Peter Maydell Aug. 19, 2012, 11:23 a.m. UTC | #8
On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
> On 08/17/2012 10:41 AM, liu ping fan wrote:
>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>> even harder to handle.  And the DO_CAST can not work for such issue.
>
> IMO omap_mpu_timer_s should be a DeviceState.  Peter?

Ideally, yes, but qemu is full of devices that haven't yet made the leap
to QOM.

omap1 is particularly tricky because I don't actually have any test images
for it, so refactoring it is a leap in the dark. [I've tried asking on this list
if anybody had test images a couple of times without success.]

-- PMM
Avi Kivity Aug. 19, 2012, 11:36 a.m. UTC | #9
On 08/19/2012 02:23 PM, Peter Maydell wrote:
> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>
>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
> 
> Ideally, yes, but qemu is full of devices that haven't yet made the leap
> to QOM.
> 
> omap1 is particularly tricky because I don't actually have any test images
> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
> if anybody had test images a couple of times without success.]

Okay.  Most of the options in this thread don't involve wholesale
conversion of the opaque parameter in memory_region_init_io() to type
Object *, so it's not necessary to convert everything.
Peter Maydell Aug. 19, 2012, 12:03 p.m. UTC | #10
On 19 August 2012 12:23, Peter Maydell <peter.maydell@linaro.org> wrote:
> Ideally, yes, but qemu is full of devices that haven't yet made the leap
> to QOM.
>
> omap1 is particularly tricky because I don't actually have any test images
> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
> if anybody had test images a couple of times without success.]

As an aside, maybe we should just delete the omap1 code if we haven't
got any way of testing it?

-- PMM
Blue Swirl Aug. 19, 2012, 1:05 p.m. UTC | #11
On Sun, Aug 19, 2012 at 12:03 PM, Peter Maydell
<peter.maydell@linaro.org> wrote:
> On 19 August 2012 12:23, Peter Maydell <peter.maydell@linaro.org> wrote:
>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>> to QOM.
>>
>> omap1 is particularly tricky because I don't actually have any test images
>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>> if anybody had test images a couple of times without success.]
>
> As an aside, maybe we should just delete the omap1 code if we haven't
> got any way of testing it?

Yes, for example start printing warnings when the machine is started
to solicit new maintainers. We should apply the same logic elsewhere
too.

>
> -- PMM
>
pingfan liu Aug. 21, 2012, 4:48 a.m. UTC | #12
On Sun, Aug 19, 2012 at 7:36 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/19/2012 02:23 PM, Peter Maydell wrote:
>> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>>
>>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
>>
>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>> to QOM.
>>
>> omap1 is particularly tricky because I don't actually have any test images
>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>> if anybody had test images a couple of times without success.]
>
> Okay.  Most of the options in this thread don't involve wholesale
> conversion of the opaque parameter in memory_region_init_io() to type
> Object *, so it's not necessary to convert everything.
>
But I think if the mr belongs to mem address space, it can not avoid
object_ref(Object *opaque) in mmio-dispatch code. ( If we use extra
flag to indicate whether the opaque is Object or not, then we lose the
meaning to transfer opaque to Object type, and maybe
memory_region_set_object() is a better choice).  And the only exempt
is that mr belong to io address space.

After carefully reading the code, I think
1. we assume Object &opaque have 1:1 relationship, if not, the Object
can be  refactored,
   which means children of Object (composite kid device or bus) will
be introduced. This
   may cause the modeling problem(but currently, can not find the example)

2. Some MemoryRegionOps are shared by different type of Device.
    For example, cirrus_vga_mem_ops will handle CirrusVGAState, and
the related Object
    can be ISACirrusVGAState or PCICirrusVGAState. So we need to remodel the
    CirrusVGAState as composite device of its parent,  and introducing
    qdev_create_in_place() just like qbus_create_in_place().

3. For the case, where opaque has no Object to relate with, simply
embeded Object at
    the head of them. (And there are lots of such issue in current code)

4. For pio, currently, we can ignore them.





> --
> error compiling committee.c: too many arguments to function
Avi Kivity Aug. 21, 2012, 8:57 a.m. UTC | #13
On 08/21/2012 07:48 AM, liu ping fan wrote:
> On Sun, Aug 19, 2012 at 7:36 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 08/19/2012 02:23 PM, Peter Maydell wrote:
>>> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>>>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>>>
>>>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
>>>
>>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>>> to QOM.
>>>
>>> omap1 is particularly tricky because I don't actually have any test images
>>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>>> if anybody had test images a couple of times without success.]
>>
>> Okay.  Most of the options in this thread don't involve wholesale
>> conversion of the opaque parameter in memory_region_init_io() to type
>> Object *, so it's not necessary to convert everything.
>>
> But I think if the mr belongs to mem address space, it can not avoid
> object_ref(Object *opaque) in mmio-dispatch code. ( If we use extra
> flag to indicate whether the opaque is Object or not, then we lose the
> meaning to transfer opaque to Object type, and maybe
> memory_region_set_object() is a better choice).  

I'm not sure I understand.  What do you mean by "ability to transfer
opaque to Object type"?

However, I think I agree that we have to object_ref() during mmio
dispatch.  At that point, there are no locks taken, so nothing gurantees
the opaque is valid.

We can't take the big qemu lock since we're holding the mem map lock,
which nests inside it.  One thing we can do is retry the walk with the
qemu lock held, but that's hardly nice.  Or we can use a trylock.

If we add memory_region_set_object() that is no help either, since we
have to convert every call site.  If we do that, I prefer that we change
opaque to an Object.


> And the only exempt
> is that mr belong to io address space.

Eventually, I'd like the io address space to be dispatched by the same
code, but for now we can ignore it.

> 
> After carefully reading the code, I think
> 1. we assume Object &opaque have 1:1 relationship, if not, the Object
> can be  refactored,
>    which means children of Object (composite kid device or bus) will
> be introduced. This
>    may cause the modeling problem(but currently, can not find the example)

Agree.  The best example is device assignment, where BARs will need to
turn into Objects.  It means that an assigned device can disappear while
one of its BARs remain, so the code will have to handle this case.

> 
> 2. Some MemoryRegionOps are shared by different type of Device.
>     For example, cirrus_vga_mem_ops will handle CirrusVGAState, and
> the related Object
>     can be ISACirrusVGAState or PCICirrusVGAState. So we need to remodel the
>     CirrusVGAState as composite device of its parent,  and introducing
>     qdev_create_in_place() just like qbus_create_in_place().
> 

Alternatively, VGACommonState can be made into an Object, as in option 3.

> 3. For the case, where opaque has no Object to relate with, simply
> embeded Object at
>     the head of them. (And there are lots of such issue in current code)
> 
> 4. For pio, currently, we can ignore them.

Yes.

Unfortunately, requiring a 1:1 opaque:Object relationship means huge
churn in the code.  But I don't think it can be avoided, even with
memory_region_set_object().
pingfan liu Aug. 21, 2012, 9:25 a.m. UTC | #14
On Tue, Aug 21, 2012 at 4:57 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/21/2012 07:48 AM, liu ping fan wrote:
>> On Sun, Aug 19, 2012 at 7:36 PM, Avi Kivity <avi@redhat.com> wrote:
>>> On 08/19/2012 02:23 PM, Peter Maydell wrote:
>>>> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>>>>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>>>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>>>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>>>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>>>>
>>>>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
>>>>
>>>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>>>> to QOM.
>>>>
>>>> omap1 is particularly tricky because I don't actually have any test images
>>>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>>>> if anybody had test images a couple of times without success.]
>>>
>>> Okay.  Most of the options in this thread don't involve wholesale
>>> conversion of the opaque parameter in memory_region_init_io() to type
>>> Object *, so it's not necessary to convert everything.
>>>
>> But I think if the mr belongs to mem address space, it can not avoid
>> object_ref(Object *opaque) in mmio-dispatch code. ( If we use extra
>> flag to indicate whether the opaque is Object or not, then we lose the
>> meaning to transfer opaque to Object type, and maybe
>> memory_region_set_object() is a better choice).
>
> I'm not sure I understand.  What do you mean by "ability to transfer
> opaque to Object type"?
>
Maybe I misunderstand your meaning of "Okay.  Most of the options in
this thread don't involve wholesale ..., so it's not necessary to
convert everything"
I think you mean that "omap_mpu_timer_s" need NOT to convert.  But as
it will also take the code path which has object_ref(Object*), so it
has to convert, otherwise the code will corrupt.
That is what I want to express.

> However, I think I agree that we have to object_ref() during mmio
> dispatch.  At that point, there are no locks taken, so nothing gurantees
> the opaque is valid.
>
> We can't take the big qemu lock since we're holding the mem map lock,
> which nests inside it.  One thing we can do is retry the walk with the
> qemu lock held, but that's hardly nice.  Or we can use a trylock.
>
> If we add memory_region_set_object() that is no help either, since we
> have to convert every call site.  If we do that, I prefer that we change
> opaque to an Object.
>
>
>> And the only exempt
>> is that mr belong to io address space.
>
> Eventually, I'd like the io address space to be dispatched by the same
> code, but for now we can ignore it.
>
OK
>>
>> After carefully reading the code, I think
>> 1. we assume Object &opaque have 1:1 relationship, if not, the Object
>> can be  refactored,
>>    which means children of Object (composite kid device or bus) will
>> be introduced. This
>>    may cause the modeling problem(but currently, can not find the example)
>
> Agree.  The best example is device assignment, where BARs will need to
> turn into Objects.  It means that an assigned device can disappear while
> one of its BARs remain, so the code will have to handle this case.
>
How about the initialization of BAR inc ref of assigned device; and
finalization of BAR dec it?
>>
>> 2. Some MemoryRegionOps are shared by different type of Device.
>>     For example, cirrus_vga_mem_ops will handle CirrusVGAState, and
>> the related Object
>>     can be ISACirrusVGAState or PCICirrusVGAState. So we need to remodel the
>>     CirrusVGAState as composite device of its parent,  and introducing
>>     qdev_create_in_place() just like qbus_create_in_place().
>>
>
> Alternatively, VGACommonState can be made into an Object, as in option 3.
>
NO, as for case 2, I think the composite device will hold ref of its
parent. Otherwise, we can not prevent  the PCICirrusVGAState from
disappearing when mmio-dispatch.  But as for case 3, it just work
around in current step. And usually, the Object is isolated. Maybe in
future, we can fix them.

>> 3. For the case, where opaque has no Object to relate with, simply
>> embeded Object at
>>     the head of them. (And there are lots of such issue in current code)
>>
>> 4. For pio, currently, we can ignore them.
>
> Yes.
>
> Unfortunately, requiring a 1:1 opaque:Object relationship means huge
> churn in the code.  But I don't think it can be avoided, even with
> memory_region_set_object().
>
Yeah, and I am studying the different case to see how to resolve and
make plans about the huge conversion.

Regards,
pingfan
> --
> error compiling committee.c: too many arguments to function
Avi Kivity Aug. 21, 2012, 10:13 a.m. UTC | #15
On 08/21/2012 12:25 PM, liu ping fan wrote:
> On Tue, Aug 21, 2012 at 4:57 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 08/21/2012 07:48 AM, liu ping fan wrote:
>>> On Sun, Aug 19, 2012 at 7:36 PM, Avi Kivity <avi@redhat.com> wrote:
>>>> On 08/19/2012 02:23 PM, Peter Maydell wrote:
>>>>> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>>>>>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>>>>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>>>>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>>>>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>>>>>
>>>>>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
>>>>>
>>>>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>>>>> to QOM.
>>>>>
>>>>> omap1 is particularly tricky because I don't actually have any test images
>>>>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>>>>> if anybody had test images a couple of times without success.]
>>>>
>>>> Okay.  Most of the options in this thread don't involve wholesale
>>>> conversion of the opaque parameter in memory_region_init_io() to type
>>>> Object *, so it's not necessary to convert everything.
>>>>
>>> But I think if the mr belongs to mem address space, it can not avoid
>>> object_ref(Object *opaque) in mmio-dispatch code. ( If we use extra
>>> flag to indicate whether the opaque is Object or not, then we lose the
>>> meaning to transfer opaque to Object type, and maybe
>>> memory_region_set_object() is a better choice).
>>
>> I'm not sure I understand.  What do you mean by "ability to transfer
>> opaque to Object type"?
>>
> Maybe I misunderstand your meaning of "Okay.  Most of the options in
> this thread don't involve wholesale ..., so it's not necessary to
> convert everything"
> I think you mean that "omap_mpu_timer_s" need NOT to convert.  

That is what I meant.

> But as
> it will also take the code path which has object_ref(Object*), so it
> has to convert, otherwise the code will corrupt.
> That is what I want to express.

Option 2, for example, had MemoryRegionOps::object(MemoryRegion *) which
can be used to convert the opaque to an Object, or return NULL.  With
that option, there would be no corruption:

  qemu_mutex_lock(&mem_map_lock);
  MemoryRegionSection mrs = walk(&phys_map);
  Object *object = mrs.mr->ops->object(mrs.mr);
  if (object) {
      object_ref(object);
  }
  qemu_mutex_unlock(&mem_map_unlock);

So there is no memory corruption.  However, as I point out below, we
also lack the reference count.  Maybe we could do something like:

  qemu_mutex_lock(&mem_map_lock);
  MemoryRegionSection mrs = walk(&phys_map);
  Object *object = mrs.mr->ops->object(mrs.mr);
  if (object) {
      object_ref(object);
  } else {
      goto retry_with_qemu_lock_held;
  }
  qemu_mutex_unlock(&mem_map_unlock);

But that's very inelegant.

>>
>> Agree.  The best example is device assignment, where BARs will need to
>> turn into Objects.  It means that an assigned device can disappear while
>> one of its BARs remain, so the code will have to handle this case.
>>
> How about the initialization of BAR inc ref of assigned device; and
> finalization of BAR dec it?

If we can avoid a cycle.  Won't the device need to hold refs to the BAR?
>> Unfortunately, requiring a 1:1 opaque:Object relationship means huge
>> churn in the code.  But I don't think it can be avoided, even with
>> memory_region_set_object().
>>
> Yeah, and I am studying the different case to see how to resolve and
> make plans about the huge conversion.

Ok.  One one hand, I think the conversion is unavoidable, and is also
beneficial: I never liked opaques.

On the other hand, those conversions are very tedious, introduce
regressions, and delay the result.

Anthony, any insight on this?
pingfan liu Aug. 21, 2012, 11:18 a.m. UTC | #16
On Tue, Aug 21, 2012 at 6:13 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/21/2012 12:25 PM, liu ping fan wrote:
>> On Tue, Aug 21, 2012 at 4:57 PM, Avi Kivity <avi@redhat.com> wrote:
>>> On 08/21/2012 07:48 AM, liu ping fan wrote:
>>>> On Sun, Aug 19, 2012 at 7:36 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>> On 08/19/2012 02:23 PM, Peter Maydell wrote:
>>>>>> On 19 August 2012 11:12, Avi Kivity <avi@redhat.com> wrote:
>>>>>>> On 08/17/2012 10:41 AM, liu ping fan wrote:
>>>>>>>> And something like omap_mpu_timer_init() in file hw/omap1.c , the
>>>>>>>> opaque(omap_mpu_timer_s) is got from g_malloc0, which makes things
>>>>>>>> even harder to handle.  And the DO_CAST can not work for such issue.
>>>>>>>
>>>>>>> IMO omap_mpu_timer_s should be a DeviceState.  Peter?
>>>>>>
>>>>>> Ideally, yes, but qemu is full of devices that haven't yet made the leap
>>>>>> to QOM.
>>>>>>
>>>>>> omap1 is particularly tricky because I don't actually have any test images
>>>>>> for it, so refactoring it is a leap in the dark. [I've tried asking on this list
>>>>>> if anybody had test images a couple of times without success.]
>>>>>
>>>>> Okay.  Most of the options in this thread don't involve wholesale
>>>>> conversion of the opaque parameter in memory_region_init_io() to type
>>>>> Object *, so it's not necessary to convert everything.
>>>>>
>>>> But I think if the mr belongs to mem address space, it can not avoid
>>>> object_ref(Object *opaque) in mmio-dispatch code. ( If we use extra
>>>> flag to indicate whether the opaque is Object or not, then we lose the
>>>> meaning to transfer opaque to Object type, and maybe
>>>> memory_region_set_object() is a better choice).
>>>
>>> I'm not sure I understand.  What do you mean by "ability to transfer
>>> opaque to Object type"?
>>>
>> Maybe I misunderstand your meaning of "Okay.  Most of the options in
>> this thread don't involve wholesale ..., so it's not necessary to
>> convert everything"
>> I think you mean that "omap_mpu_timer_s" need NOT to convert.
>
> That is what I meant.
>
>> But as
>> it will also take the code path which has object_ref(Object*), so it
>> has to convert, otherwise the code will corrupt.
>> That is what I want to express.
>
> Option 2, for example, had MemoryRegionOps::object(MemoryRegion *) which
> can be used to convert the opaque to an Object, or return NULL.  With
> that option, there would be no corruption:
>
If we did not pass extra flag to indicate whether opaque is Object or
not, we can not implement MemoryRegionOps::object().  Because we lack
of something like "try,catch", and object_dynamic_cast() can not work.
So besides memory_region_init_io(), we need extra interface to pass that flag?

>   qemu_mutex_lock(&mem_map_lock);
>   MemoryRegionSection mrs = walk(&phys_map);
>   Object *object = mrs.mr->ops->object(mrs.mr);
>   if (object) {
>       object_ref(object);
>   }
>   qemu_mutex_unlock(&mem_map_unlock);
>
> So there is no memory corruption.  However, as I point out below, we
> also lack the reference count.  Maybe we could do something like:
>
I think the hotplug issue is only for limited type of bus. And
fortunately, for pci, they have already adopt Object.
So for other SoC, maybe we can ignore this issue at current step

>   qemu_mutex_lock(&mem_map_lock);
>   MemoryRegionSection mrs = walk(&phys_map);
>   Object *object = mrs.mr->ops->object(mrs.mr);
>   if (object) {
>       object_ref(object);
>   } else {
>       goto retry_with_qemu_lock_held;
>   }
>   qemu_mutex_unlock(&mem_map_unlock);
>
> But that's very inelegant.
>
>>>
>>> Agree.  The best example is device assignment, where BARs will need to
>>> turn into Objects.  It means that an assigned device can disappear while
>>> one of its BARs remain, so the code will have to handle this case.
>>>
>> How about the initialization of BAR inc ref of assigned device; and
>> finalization of BAR dec it?
>
> If we can avoid a cycle.  Won't the device need to hold refs to the BAR?

I will check the code, but I has thought BAR is implemented by
assigned device?  Will study closely.

Thanks and regards,
pingfan

>>> Unfortunately, requiring a 1:1 opaque:Object relationship means huge
>>> churn in the code.  But I don't think it can be avoided, even with
>>> memory_region_set_object().
>>>
>> Yeah, and I am studying the different case to see how to resolve and
>> make plans about the huge conversion.
>
> Ok.  One one hand, I think the conversion is unavoidable, and is also
> beneficial: I never liked opaques.
>
> On the other hand, those conversions are very tedious, introduce
> regressions, and delay the result.
>
> Anthony, any insight on this?
>
> --
> error compiling committee.c: too many arguments to function
Avi Kivity Aug. 21, 2012, 12:41 p.m. UTC | #17
On 08/21/2012 02:18 PM, liu ping fan wrote:
>>
>>> But as
>>> it will also take the code path which has object_ref(Object*), so it
>>> has to convert, otherwise the code will corrupt.
>>> That is what I want to express.
>>
>> Option 2, for example, had MemoryRegionOps::object(MemoryRegion *) which
>> can be used to convert the opaque to an Object, or return NULL.  With
>> that option, there would be no corruption:
>>
> If we did not pass extra flag to indicate whether opaque is Object or
> not, we can not implement MemoryRegionOps::object().  Because we lack
> of something like "try,catch", and object_dynamic_cast() can not work.
> So besides memory_region_init_io(), we need extra interface to pass that flag?

The interface is MemoryRegionOps::object().

If the user does not implement it, then the opaque cannot be converted
into an object.  If the user did implement it, then the function
specifies how to convert it.

For example:


static Object *e1000_mmio_object(void *opaque)
{
    E1000State *s = opaque;

    return &s->dev.qdev.parent_obj;
}

static const MemoryRegionOps e1000_mmio_ops = {
    .read = e1000_mmio_read,
    .write = e1000_mmio_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
    .object = e1000_mmio_object,
    .impl = {
        .min_access_size = 4,
        .max_access_size = 4,
    },
};

>>   qemu_mutex_lock(&mem_map_lock);
>>   MemoryRegionSection mrs = walk(&phys_map);
>>   Object *object = mrs.mr->ops->object(mrs.mr);
>>   if (object) {
>>       object_ref(object);
>>   }
>>   qemu_mutex_unlock(&mem_map_unlock);
>>

This should read:


    Object *object = NULL;
    if (mrs.mr->ops->object) {
        object = mrs.mr->ops->object(mrs.mr);
    }


>> So there is no memory corruption.  However, as I point out below, we
>> also lack the reference count.  Maybe we could do something like:
>>
> I think the hotplug issue is only for limited type of bus. And
> fortunately, for pci, they have already adopt Object.
> So for other SoC, maybe we can ignore this issue at current step

We still have to take the big qemu lock somehow.

>> If we can avoid a cycle.  Won't the device need to hold refs to the BAR?
> 
> I will check the code, but I has thought BAR is implemented by
> assigned device?  

Yes.
pingfan liu Aug. 24, 2012, 9:47 a.m. UTC | #18
On Tue, Aug 21, 2012 at 8:41 PM, Avi Kivity <avi@redhat.com> wrote:
> On 08/21/2012 02:18 PM, liu ping fan wrote:
>>>
>>>> But as
>>>> it will also take the code path which has object_ref(Object*), so it
>>>> has to convert, otherwise the code will corrupt.
>>>> That is what I want to express.
>>>
>>> Option 2, for example, had MemoryRegionOps::object(MemoryRegion *) which
>>> can be used to convert the opaque to an Object, or return NULL.  With
>>> that option, there would be no corruption:
>>>
>> If we did not pass extra flag to indicate whether opaque is Object or
>> not, we can not implement MemoryRegionOps::object().  Because we lack
>> of something like "try,catch", and object_dynamic_cast() can not work.
>> So besides memory_region_init_io(), we need extra interface to pass that flag?
>
> The interface is MemoryRegionOps::object().
>
> If the user does not implement it, then the opaque cannot be converted
> into an object.  If the user did implement it, then the function
> specifies how to convert it.
>
> For example:
>
>
> static Object *e1000_mmio_object(void *opaque)
> {
>     E1000State *s = opaque;
>
>     return &s->dev.qdev.parent_obj;
> }
>
> static const MemoryRegionOps e1000_mmio_ops = {
>     .read = e1000_mmio_read,
>     .write = e1000_mmio_write,
>     .endianness = DEVICE_LITTLE_ENDIAN,
>     .object = e1000_mmio_object,
>     .impl = {
>         .min_access_size = 4,
>         .max_access_size = 4,
>     },
> };
>
>>>   qemu_mutex_lock(&mem_map_lock);
>>>   MemoryRegionSection mrs = walk(&phys_map);
>>>   Object *object = mrs.mr->ops->object(mrs.mr);
>>>   if (object) {
>>>       object_ref(object);
>>>   }
>>>   qemu_mutex_unlock(&mem_map_unlock);
>>>
>
> This should read:
>
>
>     Object *object = NULL;
>     if (mrs.mr->ops->object) {

Oh, thanks. I have not realize it at the first glance.  Thank you for
the great patient.
During these days, when I try to verify the convert,  I run into the
issue similar as you point out for "assigned device can disappear
while one of its BARs remain".   For example,
typedef struct PCIIDEState {
    PCIDevice dev;
    IDEBus bus[2]; ==> lies in parent obj,
..........................
}
It is an create-in-place issue,
void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
{
    qbus_create_inplace(&idebus->qbus, TYPE_IDE_BUS, dev, NULL);
    idebus->bus_id = bus_id;
}

When I try to fix such issue, I got another patchset "Subject: [PATCH
0/10] rework on hot unplug",  in a short word, unplug will break the
refer circular, so we can have back ref from IDEBus to PCIIDEState
will CC you.

Thanks and regards,
pingfan

>         object = mrs.mr->ops->object(mrs.mr);
>     }
>
>
>>> So there is no memory corruption.  However, as I point out below, we
>>> also lack the reference count.  Maybe we could do something like:
>>>
>> I think the hotplug issue is only for limited type of bus. And
>> fortunately, for pci, they have already adopt Object.
>> So for other SoC, maybe we can ignore this issue at current step
>
> We still have to take the big qemu lock somehow.
>
>>> If we can avoid a cycle.  Won't the device need to hold refs to the BAR?
>>
>> I will check the code, but I has thought BAR is implemented by
>> assigned device?
>
> Yes.
>
>
> --
> error compiling committee.c: too many arguments to function
diff mbox

Patch

diff --git a/memory.c b/memory.c
index 643871b..afd5dea 100644
--- a/memory.c
+++ b/memory.c
@@ -931,6 +931,7 @@  static void memory_region_dispatch_write(MemoryRegion *mr,

 void memory_region_init_io(MemoryRegion *mr,
                            const MemoryRegionOps *ops,
+                           Object *base,
                            void *opaque,
                            const char *name,
                            uint64_t size)
@@ -938,6 +939,7 @@  void memory_region_init_io(MemoryRegion *mr,
     memory_region_init(mr, name, size);
     mr->ops = ops;
     mr->opaque = opaque;
+    mr->base = base;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_iomem;
     mr->ram_addr = ~(ram_addr_t)0;
diff --git a/memory.h b/memory.h
index bd1bbae..2746e70 100644
--- a/memory.h
+++ b/memory.h
@@ -120,6 +120,7 @@  struct MemoryRegion {
     /* All fields are private - violators will be prosecuted */
     const MemoryRegionOps *ops;
     void *opaque;
+    Object *base;
     MemoryRegion *parent;
     Int128 size;
     target_phys_addr_t addr;