diff mbox series

Function xhci_bulk_tx( ) failed to handle double transfer EVENT TRB case in one call

Message ID AM6PR04MB5413CD5CFBE2D7B0597A034AF13F0@AM6PR04MB5413.eurprd04.prod.outlook.com
State New
Delegated to: Marek Vasut
Headers show
Series Function xhci_bulk_tx( ) failed to handle double transfer EVENT TRB case in one call | expand

Commit Message

Ran Wang Sept. 18, 2020, 11:13 a.m. UTC
Hello,
   On platform LS1028ARDB, we observed USB3.0/2.0 dongle failed to do tftp data transfer and caused system reset, I have some analysis below but still your input for help:

=>usb reset&&setenv ethact r8152_eth&&tftp a0000000 test_data
resetting USB...
Bus usb3@3100000: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
Bus usb3@3110000: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
scanning bus usb3@3100000 for devices...
Warning: r8152_eth MAC addresses don't match:
Address in ROM is          d4:6e:0e:0c:8d:14
Address in environment is  00:0e:c6:c9:5e:0f
2 USB Device(s) found
scanning bus usb3@3110000 for devices... 1 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Waiting for Ethernet connection... done.
Using r8152_eth device
TFTP from server 10.192.208.233; our IP address is 10.193.21.125
Filename 'test_data'.
Load address: 0xa0000000
Loading: BUG at drivers/usb/host/xhci-ring.c:726/xhci_bulk_tx()!
BUG!
### ERROR ### Please RESET the board ###


Further debugging shows that is due to xhci_blk_tx() miss handling 2nd event TRB in one call, causing in next call (for different EP), it got that missed event TRB and found ep_index mismatched, then hold system.
A typical scenario is shown as below (with my debug prints added):

dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb4fd80
xhci_bulk_tx()#0.trb_buff_len:0x280
xhci_bulk_tx()#0.running_total:0x280
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb50000
xhci_bulk_tx()#0.trb_buff_len:0x580
xhci_bulk_tx()#0.running_total:0x800
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB
xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
xhci_bulk_tx()#0.event->len:384
xhci_bulk_tx()#0.event->complete_code:13
xhci_bulk_tx()#1.field:0x1058001
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return

dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05)
xhci_bulk_tx()#0.1.running_total:0x640
xhci_bulk_tx()#0.2.trb_buff_len:0x640
xhci_bulk_tx()#0.3.running_total:0x640
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#0.5.running_total:0x640
xhci_bulk_tx()#0.start_trb:0x00000000fbb47610
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB
xhci_bulk_tx()#0.addr:0xfbb2f9c0
xhci_bulk_tx()#0.trb_buff_len:0x6e
xhci_bulk_tx()#0.running_total:0x6e
xhci_bulk_tx()#0.length:0x6e
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB')
xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000
xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
xhci_bulk_tx()#0.event->len:0
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#1.field:0x1058000
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG()
xhci_bulk_tx()#3.ep_index:0x5-------------------------------------
ep_index mis-match wait for event again!!!!!!!!!----------------------------------
BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()!
BUG!
### ERROR ### Please RESET the board ###


Basing on above analysis, I ever tried to remove below code from this function, issue disappear, which prove my assumption:


So, any comment? Thanks in advance.

Regards,
Ran
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 092ed6e..28197ef 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -591,14 +591,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
        ep_ctx = xhci_get_ep_ctx(ctrl, virt_dev->out_ctx, ep_index);
 
        ring = virt_dev->eps[ep_index].ring;
-       /*
-        * How much data is (potentially) left before the 64KB boundary?
-        * XHCI Spec puts restriction( TABLE 49 and 6.4.1 section of XHCI Spec)
-        * that the buffer should not span 64KB boundary. if so
-        * we send request in more than 1 TRB by chaining them.
-        */
-       running_total = TRB_MAX_BUFF_SIZE -
-                       (lower_32_bits(val_64) & (TRB_MAX_BUFF_SIZE - 1));
+       running_total = length;
        trb_buff_len = running_total;
        running_total &= TRB_MAX_BUFF_SIZE - 1;


Besides that, I also ever tried to let xhci_bulk_tx() repeatedly handle multiple EVENT TRB in one call. However, things works for several rounds but end up with event TRB Ring hang finally, reason unknow.
- Log:
dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0x00000000fbb47160
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47160---------------1st Transfer TRB
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47170-------------2nd TransferTRB
xhci_bulk_tx()#0.num_trbs_count:2
num_trbs_count > 1!!!!!!!!!--------------------------------------Here I add code to handle another event TRB
xhci_bulk_tx()#0.num_trbs_count:1
xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47170----------------------Now we got the final event TRB
xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000
xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
xhci_bulk_tx()#0.event->len:0
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#1.field:0x1058000
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
xhci_bulk_tx()#3.ep_index:0x4
xhci_bulk_tx()#4.ret:0-------------------------------------Return success
...----------------------------------------------------------------This success can last for several round call for the same EP
dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048--------------The last call when event Ring hang
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0x00000000fbb47180
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47180
xhci_bulk_tx()#0.addr:0xfbb4fd80
xhci_bulk_tx()#0.trb_buff_len:0x280
xhci_bulk_tx()#0.running_total:0x280
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47190
xhci_bulk_tx()#0.addr:0xfbb50000
xhci_bulk_tx()#0.trb_buff_len:0x580
xhci_bulk_tx()#0.running_total:0x800
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
xhci_bulk_tx()#0.num_trbs_count:2
num_trbs_count > 1!!!!!!!!!
XHCI bulk transfer timed out, aborting..................................
Rx: failed to receive: -5
T

- Related hacking:
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 092ed6e..0b4da26 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -558,6 +558,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
                        int length, void *buffer)
 {
        int num_trbs = 0;
+       int num_trbs_count = 0;
        struct xhci_generic_trb *start_trb;
        bool first_trb = false;
        int start_cycle;
@@ -614,6 +615,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
                num_trbs++;
                running_total += TRB_MAX_BUFF_SIZE;
        }
+       num_trbs_count = num_trbs;
 
        /*
         * XXX: Calling routine prepare_ring() called in place of
@@ -713,7 +715,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
        } while (running_total < length);
 
        giveback_first_trb(udev, ep_index, start_cycle, start_trb);
-
+again:
        event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
        if (!event) {
                debug("XHCI bulk transfer timed out, aborting...\n");
@@ -722,6 +724,12 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
                udev->act_len = 0;
                return -ETIMEDOUT;
        }
+       if(num_trbs_count-- > 1) {
+               printf("num_trbs_count > 1!!!!!!!!!\n");
+               xhci_acknowledge_event(ctrl);
+               goto again;
+       }
+
        field = le32_to_cpu(event->trans_event.flags);
 
        BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);