[for,2.5,1/1] e1000: fix hang of win2k12 shutdown with flood ping

Message ID 1448606921-17846-1-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev Nov. 27, 2015, 6:48 a.m. UTC
The e1000 driver in Win2k12 is badly broken. It hangs 100% of the time on
shutdown of a UP VM under flood ping. The guest checks the card state and
re-injects the interrupt into itself in a loop. This is fatal for a UP
machine.

There is no good way to fix this misbehavior other than a kludge. The
emulation has an interrupt throttling register (ITR), which limits the
interrupt rate and allows the guest to get past this phase.
This kludge causes no problem for Linux guests: they adjust the value
themselves.

On the other hand, according to the initial research in
    commit e9845f0985f088dd01790f4821026df0afba5795
    Author: Vincenzo Maffione <v.maffione@gmail.com>
    Date:   Fri Aug 2 18:30:52 2013 +0200

    e1000: add interrupt mitigation support

    ...

    Interrupt mitigation boosts performance when the guest suffers from
    an high interrupt rate (i.e. receiving short UDP packets at high packet
    rate). For some numerical results see the following link
    http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf

this should also boost performance a bit.

See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for additional
details.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vincenzo Maffione <v.maffione@gmail.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/net/e1000.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Denis V. Lunev Nov. 27, 2015, 6:50 a.m. UTC | #1
On 11/27/2015 09:48 AM, Denis V. Lunev wrote:
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index c877e06..0af528f 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -447,6 +447,9 @@ static void e1000_reset(void *opaque)
>           e1000_link_down(d);
>       }
>   
> +    /* Throttle interrupts to allow poor Win 2012 to shutdown */
> +    d->mac_reg[ITR] = 250;
> +
>       /* Some guests expect pre-initialized RAH/RAL (AddrValid flag + MACaddr) */
>       d->mac_reg[RA] = 0;
>       d->mac_reg[RA + 1] = E1000_RAH_AV;
The Intel manual says of ITR: "A initial suggested range is 651-5580
(28Bh - 15CCh)."
Should we use something other than 250? :)

http://www.intel.com/content/www/us/en/embedded/products/networking/pci-pci-x-family-gbe-controllers-software-dev-manual.html

Den
Denis V. Lunev Nov. 27, 2015, 11:42 a.m. UTC | #2
Jason, can you look at this?

I rechecked the MAINTAINERS file and found that
I had missed you here. Sorry :(

Den
Jason Wang Nov. 30, 2015, 5:58 a.m. UTC | #3
No problem.

But I have a question. What if ITR is disabled?
Denis V. Lunev Nov. 30, 2015, 6:22 a.m. UTC | #4
On the guest side, I do not think this really happens.
For that, the guest would have to set ITR to a real value and
then clear it. That is not the case here: my patch
applies on reset only, i.e. the guest does not care about
this at all and the value stays "as is". I think a real card
behaves in a similar way: it cannot generate interrupts
at the speed of a hypervisor, i.e. either there is a natural
limitation that lets it avoid this problem, or there
is a default value.

On the QEMU side, the question still stands. Fortunately,
the knob (the mitigation flag) is on by default. I think
it exists to preserve compatibility with QEMU 1.6.
In real life nobody will turn it off unless they
know what they are doing ;)

Den
Jason Wang Dec. 1, 2015, 3:31 a.m. UTC | #5
Ok, apply to my -net with minor tweaks and adding a TODO in the comment.

We've met several similar issues in the past; we need to consider a
complete solution, otherwise we may hit something
like this again.

Thanks
Denis V. Lunev Dec. 1, 2015, 9:38 a.m. UTC | #6
Thank you.

Can you please clarify: will it go into 2.5 or not?

Den
Jason Wang Dec. 2, 2015, 5:06 a.m. UTC | #7
It will go to 2.5. Plan to include this in my last pull request for 2.5.

Thanks
Peter Maydell Dec. 3, 2015, 2:43 p.m. UTC | #8
On 2 December 2015 at 05:06, Jason Wang <jasowang@redhat.com> wrote:
> It will go to 2.5. Plan to include this in my last pull request for 2.5.

Are you planning to send that pullreq today?

thanks
-- PMM

Patch

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index c877e06..0af528f 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -447,6 +447,9 @@  static void e1000_reset(void *opaque)
         e1000_link_down(d);
     }
 
+    /* Throttle interrupts to allow poor Win 2012 to shutdown */
+    d->mac_reg[ITR] = 250;
+
     /* Some guests expect pre-initialized RAH/RAL (AddrValid flag + MACaddr) */
     d->mac_reg[RA] = 0;
     d->mac_reg[RA + 1] = E1000_RAH_AV;