Patchwork [15/22] usb: Add packet combining functions

login
register
mail settings
Submitter Hans de Goede
Date Oct. 24, 2012, 4:14 p.m.
Message ID <1351095258-5579-16-git-send-email-hdegoede@redhat.com>
Download mbox | patch
Permalink /patch/193848/
State New
Headers show

Comments

Hans de Goede - Oct. 24, 2012, 4:14 p.m.
Currently we only do pipelining for output endpoints, since to properly
support short-not-ok semantics we can only have one outstanding input
packet. Since the ehci and uhci controllers have a limited per td packet
size guests will split large input transfers to into multiple packets,
and since we don't pipeline these, this comes with a serious performance
penalty.

This patch adds helper functions to (re-)combine packets which belong to 1
transfer at the guest device-driver level into 1 large transger. This can be
used by (redirection) usb-devices to enable pipelining for input endpoints.

This patch will combine packets together until a transfer terminating packet
is encountered. A terminating packet is a packet which meets one or more of
the following conditions:
1) The packet size is *not* a multiple of the endpoint max packet size
2) The packet does *not* have its short-not-ok flag set
3) The packet has its interrupt-on-complete flag set

The short-not-ok flag of the combined packet is that of the terminating packet.
Multiple combined packets may be submitted to the device, if the combined
packets do not have their short-not-ok flag set, enabling true pipelining.

If a combined packet does have its short-not-ok flag set the queue will
wait with submitting further packets to the device until that packet has
completed.

Once enabled in the usb-redir and ehci code, this improves the speed (MB/s)
of a Linux guest reading from a USB mass storage device by a factor of
1.2 - 1.5.

And the main reason why I started working on this, when reading from a pl2303
USB<->serial converter, it combines the previous 4 packets submitted per
device-driver level read into 1 big read, reducing the number of packets / sec
by a factor 4, and it allows to have multiple reads outstanding. This allows
for much better latency tolerance without the pl2303's internal buffer
overflowing (which was happening at 115200 bps, without serial flow control).

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 hw/usb.h                 |  21 ++++++
 hw/usb/Makefile.objs     |   2 +-
 hw/usb/combined-packet.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/usb/core.c            |   1 +
 4 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100644 hw/usb/combined-packet.c
Gerd Hoffmann - Oct. 25, 2012, 6:55 a.m.
On 10/24/12 18:14, Hans de Goede wrote:
> +    /*
> +     * Process / cancel combined packets, called from
> +     * usb_ep_combine_input_packets() / usb_combined_packet_cancel().
> +     * Only called for devices which call these functions themselves.
> +     */
> +    int (*handle_combined_data)(USBDevice *dev, USBPacket *p);
> +    void (*cancel_combined_packet)(USBDevice *dev, USBPacket *p);

I still think these should get a USBCombinedPacket not a USBPacket.

cheers,
  Gerd
Hans de Goede - Oct. 30, 2012, 1:23 p.m.
Hi,

On 10/25/2012 08:55 AM, Gerd Hoffmann wrote:
> On 10/24/12 18:14, Hans de Goede wrote:
>> +    /*
>> +     * Process / cancel combined packets, called from
>> +     * usb_ep_combine_input_packets() / usb_combined_packet_cancel().
>> +     * Only called for devices which call these functions themselves.
>> +     */
>> +    int (*handle_combined_data)(USBDevice *dev, USBPacket *p);
>> +    void (*cancel_combined_packet)(USBDevice *dev, USBPacket *p);
>
> I still think these should get a USBCombinedPacket not a USBPacket.

I just rebased my tree's USB bits to your usb.68 pull req, and then
tried to make this change, and then I realized again why at least
handle_combined_data is not getting a USBCombinedPacket as argument.

The call sequence goes like this:

1) hcd calls usb_handle_packet
2) usb_handle_packet calls devices handle_data (through process_one)
3) device's handle_data sees this is for a input ep on which it is
    doing input pipelining, returns USB_RET_ADD_TO_QUEUE
4) hcd calls usb_device_flush_ep_queue
5) usb_device_flush_ep_queue calls usb_ep_combine_input_packets
6) usb_ep_combine_input_packets either ends up with a combined
    packet, or with a single regular packet to send to
    the device

Currently usb_ep_combine_input_packets calls the device's
handle_combined_data method in both cases, and that can distinguish
between the 2 scenarios by checking the passed in USBPacket's
combined field.

I did things this way, even though it may seem more logical for
usb_ep_combine_input_packets to call the device's "regular"
handle_data method in case no combining is done for a packet,
so it is submitting a single regular packet, but in that case
we would end up at step 3) again, and the device's handle_data
will again return USB_RET_ADD_TO_QUEUE which is not what we want.

This is why handle_combined_data takes a USBPacket, and then checks
USBPacket->combined to see what to do, rather then taking a
USBCombinedPacket, as usb_ep_combine_input_packets simply does
not always have a combined packet to pass.

Alternatives to allow handle_combined_data to take a
USBCombinedPacket, would be:
1) Some flag to the device's handle_data method to indicate
it should *really* process the packet and not return
USB_RET_ADD_TO_QUEUE
2) Always make Uc allocate a USBCombinedPacket,
even when the entire transfer consists of only a single
packet, note that this in essence means an unnecessary
malloc + free call per such packet, and for example with
(redirected) usb-mass-storage one can expect each scsi
sense phase transfer to be only a single packet large,
and for smaller reads the data phase packets as well!

IMHO either alternative is worse then simply passing
a USBPacket to handle_combined_data, and let the
device's handle_combined_data figure out what to do
based on USBPacket->combined. Note that if it were
to take a USBCombinedPacket, it would end up getting
back to the USBPacket itself through
USBCombinedPacket->first anyways to get info from there
such as the packet id.

Regards,

Hans
Hans de Goede - Oct. 30, 2012, 1:25 p.m.
Hi,

On 10/30/2012 02:23 PM, Hans de Goede wrote:
> Hi,
>
> On 10/25/2012 08:55 AM, Gerd Hoffmann wrote:
>> On 10/24/12 18:14, Hans de Goede wrote:
>>> +    /*
>>> +     * Process / cancel combined packets, called from
>>> +     * usb_ep_combine_input_packets() / usb_combined_packet_cancel().
>>> +     * Only called for devices which call these functions themselves.
>>> +     */
>>> +    int (*handle_combined_data)(USBDevice *dev, USBPacket *p);
>>> +    void (*cancel_combined_packet)(USBDevice *dev, USBPacket *p);
>>
>> I still think these should get a USBCombinedPacket not a USBPacket.
>
> I just rebased my tree's USB bits to your usb.68 pull req, and then
> tried to make this change, and then I realized again why at least
> handle_combined_data is not getting a USBCombinedPacket as argument.
>
> The call sequence goes like this:
>
> 1) hcd calls usb_handle_packet
> 2) usb_handle_packet calls devices handle_data (through process_one)
> 3) device's handle_data sees this is for a input ep on which it is
>     doing input pipelining, returns USB_RET_ADD_TO_QUEUE
> 4) hcd calls usb_device_flush_ep_queue
> 5) usb_device_flush_ep_queue calls usb_ep_combine_input_packets
> 6) usb_ep_combine_input_packets either ends up with a combined
>     packet, or with a single regular packet to send to
>     the device
>
> Currently usb_ep_combine_input_packets calls the device's
> handle_combined_data method in both cases, and that can distinguish
> between the 2 scenarios by checking the passed in USBPacket's
> combined field.
>
> I did things this way, even though it may seem more logical for
> usb_ep_combine_input_packets to call the device's "regular"
> handle_data method in case no combining is done for a packet,
> so it is submitting a single regular packet, but in that case
> we would end up at step 3) again, and the device's handle_data
> will again return USB_RET_ADD_TO_QUEUE which is not what we want.
>
> This is why handle_combined_data takes a USBPacket, and then checks
> USBPacket->combined to see what to do, rather then taking a
> USBCombinedPacket, as usb_ep_combine_input_packets simply does
> not always have a combined packet to pass.
>
> Alternatives to allow handle_combined_data to take a
> USBCombinedPacket, would be:
> 1) Some flag to the device's handle_data method to indicate
> it should *really* process the packet and not return
> USB_RET_ADD_TO_QUEUE
> 2) Always make Uc allocate a USBCombinedPacket,
> even when the entire transfer consists of only a single
> packet, note that this in essence means an unnecessary
> malloc + free call per such packet, and for example with
> (redirected) usb-mass-storage one can expect each scsi
> sense phase transfer to be only a single packet large,
> and for smaller reads the data phase packets as well!
>
> IMHO either alternative is worse then simply passing
> a USBPacket to handle_combined_data, and let the
> device's handle_combined_data figure out what to do
> based on USBPacket->combined. Note that if it were
> to take a USBCombinedPacket, it would end up getting
> back to the USBPacket itself through
> USBCombinedPacket->first anyways to get info from there
> such as the packet id.
>

p.s.

If we stick with handle_combined_data taking a USBPacket
as argument, then I would like to do the same for
cancel_combined_packet for consistency!

Regards,

Hans
Hans de Goede - Oct. 30, 2012, 1:38 p.m.
Hi,

On 10/30/2012 02:23 PM, Hans de Goede wrote:
> Hi,
>
> On 10/25/2012 08:55 AM, Gerd Hoffmann wrote:
>> On 10/24/12 18:14, Hans de Goede wrote:
>>> +    /*
>>> +     * Process / cancel combined packets, called from
>>> +     * usb_ep_combine_input_packets() / usb_combined_packet_cancel().
>>> +     * Only called for devices which call these functions themselves.
>>> +     */
>>> +    int (*handle_combined_data)(USBDevice *dev, USBPacket *p);
>>> +    void (*cancel_combined_packet)(USBDevice *dev, USBPacket *p);
>>
>> I still think these should get a USBCombinedPacket not a USBPacket.
>
> I just rebased my tree's USB bits to your usb.68 pull req, and then
> tried to make this change, and then I realized again why at least
> handle_combined_data is not getting a USBCombinedPacket as argument.
>
> The call sequence goes like this:
>
> 1) hcd calls usb_handle_packet
> 2) usb_handle_packet calls devices handle_data (through process_one)
> 3) device's handle_data sees this is for a input ep on which it is
>     doing input pipelining, returns USB_RET_ADD_TO_QUEUE
> 4) hcd calls usb_device_flush_ep_queue
> 5) usb_device_flush_ep_queue calls usb_ep_combine_input_packets
> 6) usb_ep_combine_input_packets either ends up with a combined
>     packet, or with a single regular packet to send to
>     the device
>
> Currently usb_ep_combine_input_packets calls the device's
> handle_combined_data method in both cases, and that can distinguish
> between the 2 scenarios by checking the passed in USBPacket's
> combined field.
>
> I did things this way, even though it may seem more logical for
> usb_ep_combine_input_packets to call the device's "regular"
> handle_data method in case no combining is done for a packet,
> so it is submitting a single regular packet, but in that case
> we would end up at step 3) again, and the device's handle_data
> will again return USB_RET_ADD_TO_QUEUE which is not what we want.

Oh wait, thinking more about this, the device's handle_data
method can see the difference between the 1st and 2nd call,
as the packets state will have changed from USB_PACKET_SETUP
to USB_PACKET_QUEUED. Using that, we can do even better and
completely get rid of the handle_combined_data and
cancel_combined_packet methods. Instead have the device code
check for USBPacket->combined where appropriate, this also
makes the handle_data and cancel_packet paths more alike,
as the device already needed to check for USBPacket->combined
in its cancel_packet method.

New version coming up!

Regards,

Hans

Patch

diff --git a/hw/usb.h b/hw/usb.h
index 3a6cc84..3c8ba01 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -160,6 +160,7 @@  typedef struct USBBusOps USBBusOps;
 typedef struct USBPort USBPort;
 typedef struct USBDevice USBDevice;
 typedef struct USBPacket USBPacket;
+typedef struct USBCombinedPacket USBCombinedPacket;
 typedef struct USBEndpoint USBEndpoint;
 
 typedef struct USBDesc USBDesc;
@@ -292,6 +293,14 @@  typedef struct USBDeviceClass {
      */
     int (*handle_data)(USBDevice *dev, USBPacket *p);
 
+    /*
+     * Process / cancel combined packets, called from
+     * usb_ep_combine_input_packets() / usb_combined_packet_cancel().
+     * Only called for devices which call these functions themselves.
+     */
+    int (*handle_combined_data)(USBDevice *dev, USBPacket *p);
+    void (*cancel_combined_packet)(USBDevice *dev, USBPacket *p);
+
     void (*set_interface)(USBDevice *dev, int interface,
                           int alt_old, int alt_new);
 
@@ -356,7 +365,15 @@  struct USBPacket {
     int result; /* transfer length or USB_RET_* status code */
     /* Internal use by the USB layer.  */
     USBPacketState state;
+    USBCombinedPacket *combined;
     QTAILQ_ENTRY(USBPacket) queue;
+    QTAILQ_ENTRY(USBPacket) combined_entry;
+};
+
+struct USBCombinedPacket {
+    USBPacket *first;
+    QTAILQ_HEAD(packets_head, USBPacket) packets;
+    QEMUIOVector iov;
 };
 
 void usb_packet_init(USBPacket *p);
@@ -399,6 +416,10 @@  void usb_ep_set_pipeline(USBDevice *dev, int pid, int ep, bool enabled);
 USBPacket *usb_ep_find_packet_by_id(USBDevice *dev, int pid, int ep,
                                     uint64_t id);
 
+void usb_ep_combine_input_packets(USBEndpoint *ep);
+void usb_combined_input_packet_complete(USBDevice *dev, USBPacket *p);
+void usb_combined_packet_cancel(USBDevice *dev, USBPacket *p);
+
 void usb_attach(USBPort *port);
 void usb_detach(USBPort *port);
 void usb_port_reset(USBPort *port);
diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index 4225136..b193321 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -7,7 +7,7 @@  hw-obj-y += libhw.o
 hw-obj-$(CONFIG_SMARTCARD) += dev-smartcard-reader.o
 hw-obj-$(CONFIG_USB_REDIR) += redirect.o
 
-common-obj-y += core.o bus.o desc.o dev-hub.o
+common-obj-y += core.o combined-packet.o bus.o desc.o dev-hub.o
 common-obj-y += host-$(HOST_USB).o dev-bluetooth.o
 common-obj-y += dev-hid.o dev-storage.o dev-wacom.o
 common-obj-y += dev-serial.o dev-network.o dev-audio.o
diff --git a/hw/usb/combined-packet.c b/hw/usb/combined-packet.c
new file mode 100644
index 0000000..5f92ef9
--- /dev/null
+++ b/hw/usb/combined-packet.c
@@ -0,0 +1,183 @@ 
+/*
+ * QEMU USB packet combining code (for input pipelining)
+ *
+ * Copyright(c) 2012 Red Hat, Inc.
+ *
+ * Red Hat Authors:
+ * Hans de Goede <hdegoede@redhat.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or(at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu-common.h"
+#include "hw/usb.h"
+#include "iov.h"
+#include "trace.h"
+
+static void usb_combined_packet_add(USBCombinedPacket *combined, USBPacket *p)
+{
+    qemu_iovec_concat(&combined->iov, &p->iov, 0, p->iov.size);
+    QTAILQ_INSERT_TAIL(&combined->packets, p, combined_entry);
+    p->combined = combined;
+}
+
+static void usb_combined_packet_remove(USBCombinedPacket *combined,
+                                       USBPacket *p)
+{
+    assert(p->combined == combined);
+    p->combined = NULL;
+    QTAILQ_REMOVE(&combined->packets, p, combined_entry);
+}
+
+/* Also handles completion of non combined packets for pipelined input eps */
+void usb_combined_input_packet_complete(USBDevice *dev, USBPacket *p)
+{
+    USBCombinedPacket *combined = p->combined;
+    USBEndpoint *ep = p->ep;
+    USBPacket *next;
+    enum { completing, complete, leftover };
+    int result, state = completing;
+    bool short_not_ok;
+
+    if (combined == NULL) {
+        usb_packet_complete_one(dev, p);
+        goto leave;
+    }
+
+    assert(combined->first == p && p == QTAILQ_FIRST(&combined->packets));
+
+    result = combined->first->result;
+    short_not_ok = QTAILQ_LAST(&combined->packets, packets_head)->short_not_ok;
+
+    QTAILQ_FOREACH_SAFE(p, &combined->packets, combined_entry, next) {
+        if (state == completing) {
+            /* Distribute data over uncombined packets */
+            if (result >= p->iov.size) {
+                p->result = p->iov.size;
+            } else {
+                /* Send short or error packet to complete the transfer */
+                p->result = result;
+                state = complete;
+            }
+            p->short_not_ok = short_not_ok;
+            usb_combined_packet_remove(combined, p);
+            usb_packet_complete_one(dev, p);
+            result -= p->result;
+        } else {
+            /* Remove any leftover packets from the queue */
+            state = leftover;
+            p->result = USB_RET_REMOVE_FROM_QUEUE;
+            dev->port->ops->complete(dev->port, p);
+        }
+    }
+    /*
+     * If we had leftover packets the hcd driver will have cancelled them
+     * and usb_combined_packet_cancel has already freed combined!
+     */
+    if (state != leftover) {
+        g_free(combined);
+    }
+leave:
+    /* Check if there are packets in the queue waiting for our completion */
+    usb_ep_combine_input_packets(ep);
+}
+
+/* May only be called for combined packets! */
+void usb_combined_packet_cancel(USBDevice *dev, USBPacket *p)
+{
+    USBCombinedPacket *combined = p->combined;
+    USBDeviceClass *klass = USB_DEVICE_GET_CLASS(dev);
+    assert(combined != NULL);
+    assert(klass->cancel_combined_packet != NULL);
+
+    usb_combined_packet_remove(combined, p);
+    if (p == combined->first) {
+        klass->cancel_combined_packet(dev, p);
+    }
+    if (QTAILQ_EMPTY(&combined->packets)) {
+        g_free(combined);
+    }
+}
+
+/*
+ * Large input transfers can get split into multiple input packets, this
+ * function recombines them, removing the short_not_ok checks which all but
+ * the last packet of such splits transfers have, thereby allowing input
+ * transfer pipelining (which we cannot do on short_not_ok transfers)
+ */
+void usb_ep_combine_input_packets(USBEndpoint *ep)
+{
+    USBPacket *p, *u, *next, *prev = NULL, *first = NULL;
+    USBPort *port = ep->dev->port;
+    USBDeviceClass *klass = USB_DEVICE_GET_CLASS(ep->dev);
+    int ret;
+
+    assert(ep->pipeline);
+    assert(ep->pid == USB_TOKEN_IN);
+    assert(klass->handle_combined_data != NULL);
+
+    QTAILQ_FOREACH_SAFE(p, &ep->queue, queue, next) {
+        /* Empty the queue on a halt */
+        if (ep->halted) {
+            p->result = USB_RET_REMOVE_FROM_QUEUE;
+            port->ops->complete(port, p);
+            continue;
+        }
+
+        /* Skip packets already submitted to the device */
+        if (p->state == USB_PACKET_ASYNC) {
+            prev = p;
+            continue;
+        }
+        usb_packet_check_state(p, USB_PACKET_QUEUED);
+
+        /*
+         * If the previous (combined) packet has the short_not_ok flag set
+         * stop, as we must not submit packets to the device after a transfer
+         * ending with short_not_ok packet.
+         */
+        if (prev && prev->short_not_ok) {
+            break;
+        }
+
+        if (first) {
+            if (first->combined == NULL) {
+                USBCombinedPacket *combined = g_new0(USBCombinedPacket, 1);
+
+                combined->first = first;
+                QTAILQ_INIT(&combined->packets);
+                qemu_iovec_init(&combined->iov, 2);
+                usb_combined_packet_add(combined, first);
+            }
+            usb_combined_packet_add(first->combined, p);
+        } else {
+            first = p;
+        }
+
+        /* Is this packet the last one of a (combined) transfer? */
+        if ((p->iov.size % ep->max_packet_size) != 0 || !p->short_not_ok ||
+                next == NULL) {
+            ret = klass->handle_combined_data(ep->dev, first);
+            assert(ret == USB_RET_ASYNC);
+            if (first->combined) {
+                QTAILQ_FOREACH(u, &first->combined->packets, combined_entry) {
+                    usb_packet_set_state(u, USB_PACKET_ASYNC);
+                }
+            } else {
+                usb_packet_set_state(first, USB_PACKET_ASYNC);
+            }
+            first = NULL;
+            prev = p;
+        }
+    }
+}
diff --git a/hw/usb/core.c b/hw/usb/core.c
index 87a513f..99b9fdb 100644
--- a/hw/usb/core.c
+++ b/hw/usb/core.c
@@ -544,6 +544,7 @@  void usb_packet_setup(USBPacket *p, int pid, USBEndpoint *ep, uint64_t id,
     p->parameter = 0;
     p->short_not_ok = short_not_ok;
     p->int_req = int_req;
+    p->combined = NULL;
     qemu_iovec_reset(&p->iov);
     usb_packet_set_state(p, USB_PACKET_SETUP);
 }