Patchwork [v4,13/16] e1000: add busy flag to guard against broken device state

Submitter pingfank@linux.vnet.ibm.com
Date Oct. 22, 2012, 9:23 a.m.
Message ID <1350897839-29593-14-git-send-email-pingfank@linux.vnet.ibm.com>
Permalink /patch/193215/
State New

Comments

pingfank@linux.vnet.ibm.com - Oct. 22, 2012, 9:23 a.m.
The broken device state is caused by releasing the local lock before
acquiring the big lock. To fix this issue, we have two choices:
  1. Use a busy flag to protect the state.
     The drawback is that we introduce an independent busy flag for each
     independent logic unit of the device.
  2. Reload the device's state.
     The drawback is that if the call chain is too deep, the reload has
     to touch each layer. Reloading also means recalculating the
     intermediate results from the device's registers.

This patch adopts solution 1 to fix the issue.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 hw/e1000.c |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)
pingfan liu - Oct. 23, 2012, 5:52 a.m.
On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>> The broken device state is caused by releasing local lock before acquiring
>> big lock. To fix this issue, we have two choice:
>>   1.use busy flag to protect the state
>>     The drawback is that we will introduce independent busy flag for each
>>     independent device's logic unit.
>>   2.reload the device's state
>>     The drawback is if the call chain is too deep, the action to reload will
>>     touch each layer. Also the reloading means to recaculate the intermediate
>>     result based on device's regs.
>>
>> This patch adopt the solution 1 to fix the issue.
>
> Doesn't the nested mmio patch detect this?
>
It will only record and fix the issue on one thread, but the guest can
touch the emulated device from multiple threads.
>
> --
> error compiling committee.c: too many arguments to function
>
Avi Kivity - Oct. 23, 2012, 9:06 a.m.
On 10/23/2012 07:52 AM, liu ping fan wrote:
> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>> The broken device state is caused by releasing local lock before acquiring
>>> big lock. To fix this issue, we have two choice:
>>>   1.use busy flag to protect the state
>>>     The drawback is that we will introduce independent busy flag for each
>>>     independent device's logic unit.
>>>   2.reload the device's state
>>>     The drawback is if the call chain is too deep, the action to reload will
>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>     result based on device's regs.
>>>
>>> This patch adopt the solution 1 to fix the issue.
>>
>> Doesn't the nested mmio patch detect this?
>>
> It will only record and fix the issue on one thread. But guest can
> touch the emulated device on muti-threads.

I forgot about that.

I propose that we merge without a fix.  Upstream is broken in the same
way; it won't deadlock but it will surely break in some other way if a
write can cause another write to be triggered to the same location.

When we gain more experience with fine-graining devices we can converge
on a good solution.
Jan Kiszka - Oct. 23, 2012, 9:07 a.m.
On 2012-10-23 07:52, liu ping fan wrote:
> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>> The broken device state is caused by releasing local lock before acquiring
>>> big lock. To fix this issue, we have two choice:
>>>   1.use busy flag to protect the state
>>>     The drawback is that we will introduce independent busy flag for each
>>>     independent device's logic unit.
>>>   2.reload the device's state
>>>     The drawback is if the call chain is too deep, the action to reload will
>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>     result based on device's regs.
>>>
>>> This patch adopt the solution 1 to fix the issue.
>>
>> Doesn't the nested mmio patch detect this?
>>
> It will only record and fix the issue on one thread. But guest can
> touch the emulated device on muti-threads.

Sorry, what does that mean? A second VCPU accessing the device will
simply be ignored when it races with another VCPU? Specifically

+    if (s->busy) {
+        return;

and

+    uint64_t ret = 0;
+
+    if (s->busy) {
+        return ret;

is worrying me.

Jan
pingfan liu - Oct. 23, 2012, 9:32 a.m.
On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2012-10-23 07:52, liu ping fan wrote:
>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>> The broken device state is caused by releasing local lock before acquiring
>>>> big lock. To fix this issue, we have two choice:
>>>>   1.use busy flag to protect the state
>>>>     The drawback is that we will introduce independent busy flag for each
>>>>     independent device's logic unit.
>>>>   2.reload the device's state
>>>>     The drawback is if the call chain is too deep, the action to reload will
>>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>>     result based on device's regs.
>>>>
>>>> This patch adopt the solution 1 to fix the issue.
>>>
>>> Doesn't the nested mmio patch detect this?
>>>
>> It will only record and fix the issue on one thread. But guest can
>> touch the emulated device on muti-threads.
>
> Sorry, what does that mean? A second VCPU accessing the device will
> simply be ignored when it races with another VCPU? Specifically
>
Yes, just ignored.  A device that supports several logic units in
parallel should use an independent busy flag for each of them.

Regards,
pingfan

> +    if (s->busy) {
> +        return;
>
> and
>
> +    uint64_t ret = 0;
> +
> +    if (s->busy) {
> +        return ret;
>
> is worrying me.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
Avi Kivity - Oct. 23, 2012, 9:37 a.m.
On 10/23/2012 11:32 AM, liu ping fan wrote:
> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2012-10-23 07:52, liu ping fan wrote:
>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>> The broken device state is caused by releasing local lock before acquiring
>>>>> big lock. To fix this issue, we have two choice:
>>>>>   1.use busy flag to protect the state
>>>>>     The drawback is that we will introduce independent busy flag for each
>>>>>     independent device's logic unit.
>>>>>   2.reload the device's state
>>>>>     The drawback is if the call chain is too deep, the action to reload will
>>>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>>>     result based on device's regs.
>>>>>
>>>>> This patch adopt the solution 1 to fix the issue.
>>>>
>>>> Doesn't the nested mmio patch detect this?
>>>>
>>> It will only record and fix the issue on one thread. But guest can
>>> touch the emulated device on muti-threads.
>>
>> Sorry, what does that mean? A second VCPU accessing the device will
>> simply be ignored when it races with another VCPU? Specifically
>>
> Yes, just ignored.  For device which support many logic in parallel,
> it should use independent busy flag for each logic

We don't actually know that e1000 doesn't.  Why won't writing into
different registers in parallel work?
pingfan liu - Oct. 24, 2012, 6:36 a.m.
On Tue, Oct 23, 2012 at 5:37 PM, Avi Kivity <avi@redhat.com> wrote:
> On 10/23/2012 11:32 AM, liu ping fan wrote:
>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>> The broken device state is caused by releasing local lock before acquiring
>>>>>> big lock. To fix this issue, we have two choice:
>>>>>>   1.use busy flag to protect the state
>>>>>>     The drawback is that we will introduce independent busy flag for each
>>>>>>     independent device's logic unit.
>>>>>>   2.reload the device's state
>>>>>>     The drawback is if the call chain is too deep, the action to reload will
>>>>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>>>>     result based on device's regs.
>>>>>>
>>>>>> This patch adopt the solution 1 to fix the issue.
>>>>>
>>>>> Doesn't the nested mmio patch detect this?
>>>>>
>>>> It will only record and fix the issue on one thread. But guest can
>>>> touch the emulated device on muti-threads.
>>>
>>> Sorry, what does that mean? A second VCPU accessing the device will
>>> simply be ignored when it races with another VCPU? Specifically
>>>
>> Yes, just ignored.  For device which support many logic in parallel,
>> it should use independent busy flag for each logic
>
> We don't actually know that e1000 doesn't.  Why won't writing into
> different registers in parallel work?
>
I think the e1000 has only one transfer logic, so one busy flag is
enough, and a normal guest driver will access the registers one by one.
But in general a device may have parallel modules, so what about
modeling it like this:
if busy:
  wait

clear busy:
   wakeup

Regards,
pingfan
>
> --
> error compiling committee.c: too many arguments to function
Avi Kivity - Oct. 25, 2012, 8:55 a.m.
On 10/24/2012 08:36 AM, liu ping fan wrote:
> On Tue, Oct 23, 2012 at 5:37 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>>> The broken device state is caused by releasing local lock before acquiring
>>>>>>> big lock. To fix this issue, we have two choice:
>>>>>>>   1.use busy flag to protect the state
>>>>>>>     The drawback is that we will introduce independent busy flag for each
>>>>>>>     independent device's logic unit.
>>>>>>>   2.reload the device's state
>>>>>>>     The drawback is if the call chain is too deep, the action to reload will
>>>>>>>     touch each layer. Also the reloading means to recaculate the intermediate
>>>>>>>     result based on device's regs.
>>>>>>>
>>>>>>> This patch adopt the solution 1 to fix the issue.
>>>>>>
>>>>>> Doesn't the nested mmio patch detect this?
>>>>>>
>>>>> It will only record and fix the issue on one thread. But guest can
>>>>> touch the emulated device on muti-threads.
>>>>
>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>> simply be ignored when it races with another VCPU? Specifically
>>>>
>>> Yes, just ignored.  For device which support many logic in parallel,
>>> it should use independent busy flag for each logic
>>
>> We don't actually know that e1000 doesn't.  Why won't writing into
>> different registers in parallel work?
>>
> I think e1000 has only one transfer logic, so one busy flag is enough.
> And the normal guest's driver will access the registers one by one.
> But anyway, it may have parallel modules.  So what about model it like
> this
> if busy:
>   wait
> 
> clear busy:
>    wakeup
> 

You mean lock()/unlock()?

Again, I suggest ignoring this issue for now.  We need to make progress,
and we can't get everything perfect (or even agree on everything).  When
we have converted a few devices, we will have more information and can
think of a good solution.
Peter Maydell - Oct. 25, 2012, 9 a.m.
On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
> On 10/23/2012 11:32 AM, liu ping fan wrote:
>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>> It will only record and fix the issue on one thread. But guest can
>>>> touch the emulated device on muti-threads.
>>>
>>> Sorry, what does that mean? A second VCPU accessing the device will
>>> simply be ignored when it races with another VCPU? Specifically
>>>
>> Yes, just ignored.  For device which support many logic in parallel,
>> it should use independent busy flag for each logic
>
> We don't actually know that e1000 doesn't.  Why won't writing into
> different registers in parallel work?

Unless the device we're emulating supports multiple accesses in
parallel (and I bet 99.9% of the devices we model don't), the memory
framework needs to serialise the loads/stores.  Otherwise it's just
going to be excessively hard to write a reliable device model.

-- PMM
Avi Kivity - Oct. 25, 2012, 9:04 a.m.
On 10/25/2012 11:00 AM, Peter Maydell wrote:
> On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>> It will only record and fix the issue on one thread. But guest can
>>>>> touch the emulated device on muti-threads.
>>>>
>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>> simply be ignored when it races with another VCPU? Specifically
>>>>
>>> Yes, just ignored.  For device which support many logic in parallel,
>>> it should use independent busy flag for each logic
>>
>> We don't actually know that e1000 doesn't.  Why won't writing into
>> different registers in parallel work?
> 
> Unless the device we're emulating supports multiple in
> parallel accesses (and I bet 99.9% of the devices we model
> don't) then the memory framework needs to serialise the
> loads/stores. Otherwise it's just going to be excessively
> hard to write a reliable device model.

That's why we have a per-device lock.  The busy flag breaks that model
by discarding accesses that occur in parallel.
pingfan liu - Oct. 26, 2012, 3:05 a.m.
On Thu, Oct 25, 2012 at 5:04 PM, Avi Kivity <avi@redhat.com> wrote:
> On 10/25/2012 11:00 AM, Peter Maydell wrote:
>> On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
>>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>> It will only record and fix the issue on one thread. But guest can
>>>>>> touch the emulated device on muti-threads.
>>>>>
>>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>>> simply be ignored when it races with another VCPU? Specifically
>>>>>
>>>> Yes, just ignored.  For device which support many logic in parallel,
>>>> it should use independent busy flag for each logic
>>>
>>> We don't actually know that e1000 doesn't.  Why won't writing into
>>> different registers in parallel work?
>>
>> Unless the device we're emulating supports multiple in
>> parallel accesses (and I bet 99.9% of the devices we model
>> don't) then the memory framework needs to serialise the
>> loads/stores. Otherwise it's just going to be excessively
>> hard to write a reliable device model.
>
> That's why we have a per-device lock.  The busy flag breaks that model
> by discarding accesses that occur in parallel.
>
I think by adopting the following model, we can avoid this.

struct device_logic {
  bool busy;
  qemu_mutex lock;
  QemuCond wait;
};

LOCK(logic->lock)
while (logic->busy) {
qemu_cond_wait(&logic->wait, &logic->lock);
}
....
do hardware emulation
...
logic->busy = false;
qemu_cond_signal(&logic->wait);

This is identical to the big lock's behavior for parallel device access
today.  The remaining question is then what level of parallelism we
want: if we expect more parallelism, we need to break the device down
into finer-grained logic units.


Regards,
pingfan
>
> --
> error compiling committee.c: too many arguments to function
pingfan liu - Oct. 26, 2012, 3:08 a.m.
On Fri, Oct 26, 2012 at 11:05 AM, liu ping fan <qemulist@gmail.com> wrote:
> On Thu, Oct 25, 2012 at 5:04 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 10/25/2012 11:00 AM, Peter Maydell wrote:
>>> On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
>>>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>>> It will only record and fix the issue on one thread. But guest can
>>>>>>> touch the emulated device on muti-threads.
>>>>>>
>>>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>>>> simply be ignored when it races with another VCPU? Specifically
>>>>>>
>>>>> Yes, just ignored.  For device which support many logic in parallel,
>>>>> it should use independent busy flag for each logic
>>>>
>>>> We don't actually know that e1000 doesn't.  Why won't writing into
>>>> different registers in parallel work?
>>>
>>> Unless the device we're emulating supports multiple in
>>> parallel accesses (and I bet 99.9% of the devices we model
>>> don't) then the memory framework needs to serialise the
>>> loads/stores. Otherwise it's just going to be excessively
>>> hard to write a reliable device model.
>>
>> That's why we have a per-device lock.  The busy flag breaks that model
>> by discarding accesses that occur in parallel.
>>
> I think by adopting the model, we can avoid this.
>
> struct device_logic {
>   bool busy;
>   qemu_mutex lock;
>   QemuCond wait;
> };
>
> LOCK(logic->lock)
> while (logic->busy) {
> qemu_cond_wait(&logic->wait, &logic->lock);
> }
> ....
> do hardware emulation
> ...
> logic->busy = false;
UNLOCK(lock); <-------------------------------------- forget
> qemu_cond_signal(&logic->wait);
>
> This is identical to the biglock's behavior for parallel access to
> device for nowadays. And then, the problem left is what level for
> parallel we want. If we expect more parallel, we need to degrade the
> device into more logic unit.
>
>
> Regards,
> pingfan
>>
>> --
>> error compiling committee.c: too many arguments to function
Jan Kiszka - Oct. 26, 2012, 10:25 a.m.
On 2012-10-26 05:08, liu ping fan wrote:
> On Fri, Oct 26, 2012 at 11:05 AM, liu ping fan <qemulist@gmail.com> wrote:
>> On Thu, Oct 25, 2012 at 5:04 PM, Avi Kivity <avi@redhat.com> wrote:
>>> On 10/25/2012 11:00 AM, Peter Maydell wrote:
>>>> On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
>>>>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>>>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>>>> It will only record and fix the issue on one thread. But guest can
>>>>>>>> touch the emulated device on muti-threads.
>>>>>>>
>>>>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>>>>> simply be ignored when it races with another VCPU? Specifically
>>>>>>>
>>>>>> Yes, just ignored.  For device which support many logic in parallel,
>>>>>> it should use independent busy flag for each logic
>>>>>
>>>>> We don't actually know that e1000 doesn't.  Why won't writing into
>>>>> different registers in parallel work?
>>>>
>>>> Unless the device we're emulating supports multiple in
>>>> parallel accesses (and I bet 99.9% of the devices we model
>>>> don't) then the memory framework needs to serialise the
>>>> loads/stores. Otherwise it's just going to be excessively
>>>> hard to write a reliable device model.
>>>
>>> That's why we have a per-device lock.  The busy flag breaks that model
>>> by discarding accesses that occur in parallel.
>>>
>> I think by adopting the model, we can avoid this.
>>
>> struct device_logic {
>>   bool busy;
>>   qemu_mutex lock;
>>   QemuCond wait;
>> };
>>
>> LOCK(logic->lock)
>> while (logic->busy) {
>> qemu_cond_wait(&logic->wait, &logic->lock);
>> }
>> ....
>> do hardware emulation
>> ...
>> logic->busy = false;
> UNLOCK(lock); <-------------------------------------- forget
>> qemu_cond_signal(&logic->wait);
>>
>> This is identical to the biglock's behavior for parallel access to
>> device for nowadays. And then, the problem left is what level for
>> parallel we want. If we expect more parallel, we need to degrade the
>> device into more logic unit.

But where is the remaining added value of the busy flag then? Everyone
could just as well be serialized by the lock itself. And even dropping
the lock while running the hw emulation doesn't change anything about
the semantics - or about the ABBA problems I sketched yesterday.

Jan
pingfan liu - Oct. 29, 2012, 5:24 a.m.
On Fri, Oct 26, 2012 at 6:25 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2012-10-26 05:08, liu ping fan wrote:
>> On Fri, Oct 26, 2012 at 11:05 AM, liu ping fan <qemulist@gmail.com> wrote:
>>> On Thu, Oct 25, 2012 at 5:04 PM, Avi Kivity <avi@redhat.com> wrote:
>>>> On 10/25/2012 11:00 AM, Peter Maydell wrote:
>>>>> On 23 October 2012 10:37, Avi Kivity <avi@redhat.com> wrote:
>>>>>> On 10/23/2012 11:32 AM, liu ping fan wrote:
>>>>>>> On Tue, Oct 23, 2012 at 5:07 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>>> On 2012-10-23 07:52, liu ping fan wrote:
>>>>>>>>> On Mon, Oct 22, 2012 at 6:40 PM, Avi Kivity <avi@redhat.com> wrote:
>>>>>>>>>> On 10/22/2012 11:23 AM, Liu Ping Fan wrote:
>>>>>>>>> It will only record and fix the issue on one thread. But guest can
>>>>>>>>> touch the emulated device on muti-threads.
>>>>>>>>
>>>>>>>> Sorry, what does that mean? A second VCPU accessing the device will
>>>>>>>> simply be ignored when it races with another VCPU? Specifically
>>>>>>>>
>>>>>>> Yes, just ignored.  For device which support many logic in parallel,
>>>>>>> it should use independent busy flag for each logic
>>>>>>
>>>>>> We don't actually know that e1000 doesn't.  Why won't writing into
>>>>>> different registers in parallel work?
>>>>>
>>>>> Unless the device we're emulating supports multiple in
>>>>> parallel accesses (and I bet 99.9% of the devices we model
>>>>> don't) then the memory framework needs to serialise the
>>>>> loads/stores. Otherwise it's just going to be excessively
>>>>> hard to write a reliable device model.
>>>>
>>>> That's why we have a per-device lock.  The busy flag breaks that model
>>>> by discarding accesses that occur in parallel.
>>>>
>>> I think by adopting the model, we can avoid this.
>>>
>>> struct device_logic {
>>>   bool busy;
>>>   qemu_mutex lock;
>>>   QemuCond wait;
>>> };
>>>
>>> LOCK(logic->lock)
>>> while (logic->busy) {
>>> qemu_cond_wait(&logic->wait, &logic->lock);
>>> }
>>> ....
>>> do hardware emulation
>>> ...
>>> logic->busy = false;
>> UNLOCK(lock); <-------------------------------------- forget
>>> qemu_cond_signal(&logic->wait);
>>>
>>> This is identical to the biglock's behavior for parallel access to
>>> device for nowadays. And then, the problem left is what level for
>>> parallel we want. If we expect more parallel, we need to degrade the
>>> device into more logic unit.
>
> But where is the remaining added-value of the busy flag then? Everyone
> could just as well be serialized by the lock itself. And even when
> dropping the lock while running the hw emulation, that doesn't change

The key is that the local lock is broken, so we rely on a higher-level
lock -- the busy flag.  After threading each subsystem, we will no
longer face the broken-local-lock issue and can throw this design away.

> anything to the semantic - and our ABBA problems I sketched yesterday.
>
Oh, the ABBA problem cannot be solved; I think we need a clever deadlock detector.

Regards,
pingfan

> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux
Peter Maydell - Oct. 29, 2012, 7:50 a.m.
On 29 October 2012 05:24, liu ping fan <qemulist@gmail.com> wrote:
> Oh, ABBA problem can not be solved, I think we need clever deadlock detector.

If you cannot solve the problem then you must remain single threaded.

-- PMM

Patch

diff --git a/hw/e1000.c b/hw/e1000.c
index 5eddab5..0b4fce5 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -86,6 +86,7 @@  typedef struct E1000State_st {
     MemoryRegion mmio;
     MemoryRegion io;
     QemuMutex e1000_lock;
+    int busy;
 
     uint32_t mac_reg[0x8000];
     uint16_t phy_reg[0x20];
@@ -1033,6 +1034,11 @@  e1000_mmio_write(void *opaque, target_phys_addr_t addr, uint64_t val,
     E1000State *s = opaque;
     unsigned int index = (addr & 0x1ffff) >> 2;
 
+    if (s->busy) {
+        return;
+    } else {
+        s->busy = 1;
+    }
     if (index < NWRITEOPS && macreg_writeops[index]) {
         macreg_writeops[index](s, index, val);
     } else if (index < NREADOPS && macreg_readops[index]) {
@@ -1041,6 +1047,7 @@  e1000_mmio_write(void *opaque, target_phys_addr_t addr, uint64_t val,
         DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08"PRIx64"\n",
                index<<2, val);
     }
+    s->busy = 0;
 }
 
 static uint64_t
@@ -1048,13 +1055,22 @@  e1000_mmio_read(void *opaque, target_phys_addr_t addr, unsigned size)
 {
     E1000State *s = opaque;
     unsigned int index = (addr & 0x1ffff) >> 2;
+    uint64_t ret = 0;
+
+    if (s->busy) {
+        return ret;
+    } else {
+        s->busy = 1;
+    }
 
     if (index < NREADOPS && macreg_readops[index])
     {
-        return macreg_readops[index](s, index);
+        ret = macreg_readops[index](s, index);
+    } else {
+        DBGOUT(UNKNOWN, "MMIO unknown read addr=0x%08x\n", index<<2);
     }
-    DBGOUT(UNKNOWN, "MMIO unknown read addr=0x%08x\n", index<<2);
-    return 0;
+    s->busy = 0;
+    return ret;
 }
 
 static const MemoryRegionOps e1000_mmio_ops = {
@@ -1242,6 +1258,7 @@  static int pci_e1000_init(PCIDevice *pci_dev)
     uint8_t *macaddr;
 
     qemu_mutex_init(&d->e1000_lock);
+    d->busy = 0;
 
     pci_conf = d->dev.config;