Patchwork [06/22] ehci: Speed up the timer of raising int from the async schedule

Submitter Hans de Goede
Date Oct. 15, 2012, 10:38 a.m.
Message ID <1350297511-25437-7-git-send-email-hdegoede@redhat.com>
Permalink /patch/191530/
State New

Comments

Hans de Goede - Oct. 15, 2012, 10:38 a.m.
Often the guest will queue up new packets in response to the completion of a
packet in the async schedule that has its IOC flag set. By speeding up the
frame-timer, we notice these new packets earlier. This increases the
speed (MB/s) of a Linux guest reading from a USB mass storage device by a
factor of 1.15 on top of the "Improve latency of interrupt delivery"
speed-ups, both with and without input pipelining enabled.

I've not tested the speed-up of this patch without the
"Improve latency of interrupt delivery" patch.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 hw/usb/hcd-ehci.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
Gerd Hoffmann - Oct. 15, 2012, 11:17 a.m.
On 10/15/12 12:38, Hans de Goede wrote:
> Often the guest will queue up new packets in response to a packet, in the
> async schedule with its IOC flag set, completing. By speeding up the
> frame-timer, we notice these new packets earlier. This increases the
> speed (MB/s) of a Linux guest reading from a USB mass storage device by a
> factor of 1.15 on top of the "Improve latency of interrupt delivery"
> speed-ups, both with and without input pipelining enabled.

Why not just set async_stepdown to 0?

cheers,
  Gerd
Hans de Goede - Oct. 15, 2012, 1 p.m.
Hi,

On 10/15/2012 01:17 PM, Gerd Hoffmann wrote:
> On 10/15/12 12:38, Hans de Goede wrote:
>> Often the guest will queue up new packets in response to a packet, in the
>> async schedule with its IOC flag set, completing. By speeding up the
>> frame-timer, we notice these new packets earlier. This increases the
>> speed (MB/s) of a Linux guest reading from a USB mass storage device by a
>> factor of 1.15 on top of the "Improve latency of interrupt delivery"
>> speed-ups, both with and without input pipelining enabled.
>
> Why not just set async_stepdown to 0?

We already do that whenever a packet completes (it gets set when we move to
the executing stage). What this patch does is request the frame timer to run
again in 500 usecs instead of after 1 ms, so that we see and process async
transfers sooner when they are queued up in response to just-completed
packets (which we've told the guest about with the int interrupt). This
shortens the USB bus / device idle time between any 2 of the 3 transfers
involved in a USB storage BOT command, thereby speeding things up.
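
To put numbers on the two timer periods, here is a minimal standalone sketch
of the interval arithmetic. It assumes FRAME_TIMER_FREQ is 1000 and a
nanosecond tick rate (standing in for get_ticks_per_sec()); the helper
frame_timer_interval() is hypothetical and only for illustration, it is not
a function in hcd-ehci.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAME_TIMER_FREQ 1000          /* assumed: 1000 Hz base rate */
#define TICKS_PER_SEC    1000000000LL  /* assumed: nanosecond clock */

/* Hypothetical helper: ticks until the frame timer should fire again. */
static int64_t frame_timer_interval(uint32_t async_stepdown, bool fast)
{
    /* Keep the multiplication below well within int64_t range. */
    assert(async_stepdown < 1000);

    if (fast) {
        /* Half a frame period: 500 usecs with the values above. */
        return TICKS_PER_SEC / (FRAME_TIMER_FREQ * 2);
    }
    /* Normal case: one frame period, stretched by async_stepdown. */
    return TICKS_PER_SEC * ((int64_t)async_stepdown + 1) / FRAME_TIMER_FREQ;
}
```

Under these assumptions frame_timer_interval(0, true) gives 500000 ns and
frame_timer_interval(0, false) gives 1000000 ns, i.e. the 500 usecs versus
1 ms mentioned above.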

Regards,

Hans
Gerd Hoffmann - Oct. 17, 2012, 11:01 a.m.
On 10/15/12 15:00, Hans de Goede wrote:
> Hi,
> 
> On 10/15/2012 01:17 PM, Gerd Hoffmann wrote:
>> On 10/15/12 12:38, Hans de Goede wrote:
>>> Often the guest will queue up new packets in response to a packet, in
>>> the
>>> async schedule with its IOC flag set, completing. By speeding up the
>>> frame-timer, we notice these new packets earlier. This increases the
>>> speed (MB/s) of a Linux guest reading from a USB mass storage device
>>> by a
>>> factor of 1.15 on top of the "Improve latency of interrupt delivery"
>>> speed-ups, both with and without input pipelining enabled.
>>
>> Why not just set async_stepdown to 0?
> 
> We already do that whenever a packet completes (it gets set when we move to
> the executing stage). What this patch does is request the frame timer to run
> again in 500 usecs instead of after 1 ms, so that we see and process async
> transfers sooner when they are queued up in response to just-completed
> packets (which we've told the guest about with the int interrupt). This
> shortens the USB bus / device idle time between any 2 of the 3 transfers
> involved in a USB storage BOT command, thereby speeding things up.

Don't feel like having two mechanisms for wakeup rate control.  Can't we
integrate this with async_stepdown?  Changing the baseline maybe, so
stepdown=0 doesn't mean 1000 Hz but 2000 Hz?
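
As a rough sketch of what shifting the baseline could look like (one possible
reading of the suggestion, not code from the tree): the same stepdown formula
with the divisor doubled. FRAME_TIMER_FREQ = 1000 and the nanosecond tick
rate are assumptions, and shifted_interval() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_TIMER_FREQ 1000          /* assumed: 1000 Hz base rate */
#define TICKS_PER_SEC    1000000000LL  /* assumed: nanosecond clock */

/* Hypothetical: stepdown=0 now means 2000 Hz, stepdown=1 means 1000 Hz. */
static int64_t shifted_interval(uint32_t async_stepdown)
{
    /* Keep the multiplication below well within int64_t range. */
    assert(async_stepdown < 1000);

    return TICKS_PER_SEC * ((int64_t)async_stepdown + 1)
           / (FRAME_TIMER_FREQ * 2);
}
```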

cheers,
  Gerd
Hans de Goede - Oct. 17, 2012, 11:11 a.m.
Hi,

On 10/17/2012 01:01 PM, Gerd Hoffmann wrote:
> On 10/15/12 15:00, Hans de Goede wrote:
>> Hi,
>>
>> On 10/15/2012 01:17 PM, Gerd Hoffmann wrote:
>>> On 10/15/12 12:38, Hans de Goede wrote:
>>>> Often the guest will queue up new packets in response to a packet, in
>>>> the
>>>> async schedule with its IOC flag set, completing. By speeding up the
>>>> frame-timer, we notice these new packets earlier. This increases the
>>>> speed (MB/s) of a Linux guest reading from a USB mass storage device
>>>> by a
>>>> factor of 1.15 on top of the "Improve latency of interrupt delivery"
>>>> speed-ups, both with and without input pipelining enabled.
>>>
>>> Why not just set async_stepdown to 0?
>>
>> We already do that whenever a packet completes (it gets set when we move to
>> the executing stage). What this patch does is request the frame timer to run
>> again in 500 usecs instead of after 1 ms, so that we see and process async
>> transfers sooner when they are queued up in response to just-completed
>> packets (which we've told the guest about with the int interrupt). This
>> shortens the USB bus / device idle time between any 2 of the 3 transfers
>> involved in a USB storage BOT command, thereby speeding things up.
>
> Don't feel like having two mechanisms for wakeup rate control.  Can't we
> integrate this with async_stepdown?  Changing the baseline maybe, so
> stepdown=0 doesn't mean 1000 Hz but 2000 Hz?

That is actually close to what I wanted to do at first (I wanted to use
stepdown=-1 for the faster wakeup case). But there are 2 problems with this:

1) It causes migration issues when migrating to / from an old version
2) We don't want to change the wakeup rate when the interrupt flag gets set
as pending, but when it actually gets committed, and we only want to change
the wakeup rate when the int was requested by an async packet, not when it
was requested by a periodic packet, so we will need the int_req_by_async
flag anyway, at which point this seemed the cleanest way.
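
To illustrate point 2, here is a simplified standalone model of the flag's
life cycle. It is a sketch under assumptions: struct model, the
usbsts_pending field and the model_*() helpers are hypothetical stand-ins
for the real EHCIState and its raise/commit logic, kept only detailed enough
to show that the fast timer is used once, and only after an int requested by
an async packet has actually been committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USBSTS_INT (1u << 0)  /* USBINT bit of the USBSTS register */

/* Hypothetical, stripped-down stand-in for EHCIState. */
struct model {
    uint32_t usbsts;          /* committed status bits, visible to the guest */
    uint32_t usbsts_pending;  /* assumed: raised but not yet committed */
    bool int_req_by_async;    /* int was requested by an async packet */
};

/* A packet completed; remember who asked for the int if IOC was set. */
static void model_complete(struct model *m, bool async, bool ioc)
{
    assert(m);
    if (ioc) {
        m->usbsts_pending |= USBSTS_INT;
        if (async) {
            m->int_req_by_async = true;
        }
    }
}

/* Move pending bits into the committed status register. */
static void model_commit(struct model *m)
{
    assert(m);
    m->usbsts |= m->usbsts_pending;
    m->usbsts_pending = 0;
}

/* Timer decision: true means "use the half period"; consumes the flag. */
static bool model_use_fast_timer(struct model *m)
{
    assert(m);
    if (m->int_req_by_async && (m->usbsts & USBSTS_INT)) {
        m->int_req_by_async = false;
        return true;
    }
    return false;
}
```

In this model an async IOC completion that is still only pending does not
speed up the timer, a committed one does so exactly once, and a periodic
completion never does.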

Regards,

Hans
Gerd Hoffmann - Oct. 17, 2012, 11:37 a.m.
Hi,

> 1) It causes migration issues when migrating to / from an old version

With -1 yes; shifting the scale shouldn't be that big an issue though, as
it is just an optimization.

> 2) We don't want to change the wakeup rate when the interrupt flag gets set
> as pending, but when it actually gets committed, and we only want to change
> the wakeup rate when the int was requested by an async packet, not when it
> was requested by a periodic packet, so we will need the int_req_by_async
> flag anyway, at which point this seemed the cleanest way.

Missed that little detail, ok then.

cheers,
  Gerd

Patch

diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index bbfa441..58e788b 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -443,6 +443,7 @@  struct EHCIState {
 
     uint64_t last_run_ns;
     uint32_t async_stepdown;
+    bool int_req_by_async;
 };
 
 #define SET_LAST_RUN_CLOCK(s) \
@@ -1529,6 +1530,9 @@  static void ehci_execute_complete(EHCIQueue *q)
 
     if (q->qh.token & QTD_TOKEN_IOC) {
         ehci_raise_irq(q->ehci, USBSTS_INT);
+        if (q->async) {
+            q->ehci->int_req_by_async = true;
+        }
     }
 }
 
@@ -2503,8 +2507,15 @@  static void ehci_frame_timer(void *opaque)
     }
 
     if (need_timer) {
-        expire_time = t_now + (get_ticks_per_sec()
+        /* If we've raised int, we speed up the timer, so that we quickly
+         * notice any new packets queued up in response */
+        if (ehci->int_req_by_async && (ehci->usbsts & USBSTS_INT)) {
+            expire_time = t_now + get_ticks_per_sec() / (FRAME_TIMER_FREQ * 2);
+            ehci->int_req_by_async = false;
+        } else {
+            expire_time = t_now + (get_ticks_per_sec()
                                * (ehci->async_stepdown+1) / FRAME_TIMER_FREQ);
+        }
         qemu_mod_timer(ehci->frame_timer, expire_time);
     }
 }