Message ID | 20200922045422.27973-1-ran.wang_1@nxp.com |
---|---|
State | Superseded |
Delegated to: | Marek Vasut |
Headers | show |
Series | usb: xhci: fix event trb handling missed | expand |
On 9/22/20 6:54 AM, Ran Wang wrote: > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will > send request in more than 1 Transfer TRB by chaining them, but then handle > only 1 event TRB to mark request completed. > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci > controller will generated more than 1 event TRB sometimes, this cause that > function mishandle event TRB in next round call, then system hang due to > BUG() checking. > > This patch adds a loop to make sure the event TRB for last Transfer TRB has > to be handled in time. Bin, can you please take a look at this one ? Thanks
On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut <marex@denx.de> wrote: > > On 9/22/20 6:54 AM, Ran Wang wrote: > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will > > send request in more than 1 Transfer TRB by chaining them, but then handle > > only 1 event TRB to mark request completed. > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci > > controller will generated more than 1 event TRB sometimes, this cause that > > function mishandle event TRB in next round call, then system hang due to > > BUG() checking. > > > > This patch adds a loop to make sure the event TRB for last Transfer TRB has > > to be handled in time. > > Bin, can you please take a look at this one ? Thanks Sorry I missed this one. Will take a look at this soon. Regards, Bin
Hi Bin, On Friday, November 6, 2020 3:54 PM, Bin Meng wrote: > To: Marek Vasut <marex@denx.de> > Cc: Ran Wang <ran.wang_1@nxp.com>; U-Boot Mailing List > <u-boot@lists.denx.de> > Subject: Re: [PATCH] usb: xhci: fix event trb handling missed > > On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut <marex@denx.de> wrote: > > > > On 9/22/20 6:54 AM, Ran Wang wrote: > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it > > > will send request in more than 1 Transfer TRB by chaining them, but > > > then handle only 1 event TRB to mark request completed. > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe > > > xhci controller will generated more than 1 event TRB sometimes, this > > > cause that function mishandle event TRB in next round call, then > > > system hang due to > > > BUG() checking. > > > > > > This patch adds a loop to make sure the event TRB for last Transfer > > > TRB has to be handled in time. > > > > Bin, can you please take a look at this one ? Thanks > > Sorry I missed this one. Will take a look at this soon. Kindly remind that this patch is to resolve issue I ever sent to community for help: https://lists.denx.de/pipermail/u-boot/2020-September/426981.html Thanks & Regards, Ran
Hi Ran, On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will typo: function > send request in more than 1 Transfer TRB by chaining them, but then handle > only 1 event TRB to mark request completed. > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci > controller will generated more than 1 event TRB sometimes, this cause that I am not sure if it's fair to say this is Layerscape specific behavior. Based on the xHCI spec, the spec indicates 1 event trb should be generated when IOC/ISP flag is set to 1 or an error occurs. I will see if I can reproduce your issue on an x86 board. > function mishandle event TRB in next round call, then system hang due to > BUG() checking. > > This patch adds a loop to make sure the event TRB for last Transfer TRB has > to be handled in time. > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com> > --- > drivers/usb/host/xhci-ring.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 092ed6e..d77e058 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > int ret; > u32 trb_fields[4]; > u64 val_64 = virt_to_phys(buffer); > + void *last_transfer_trb_addr; > + int available_length; > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n", > udev, pipe, buffer, length); > > + available_length = length; > ep_index = usb_pipe_ep_index(pipe); > virt_dev = ctrl->devs[slot_id]; > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > trb_fields[2] = length_field; > trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT); > > - queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > + last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > > --num_trbs; > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > > giveback_first_trb(udev, ep_index, start_cycle, start_trb); > > +again: > event = xhci_wait_for_event(ctrl, TRB_TRANSFER); > if (!event) { > debug("XHCI bulk transfer timed out, aborting...\n"); > @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > udev->act_len = 0; > return -ETIMEDOUT; > } > - field = le32_to_cpu(event->trans_event.flags); > > + if ((void *)event->trans_event.buffer != last_transfer_trb_addr) { This should be: if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr) > + available_length -= > + (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)); > + xhci_acknowledge_event(ctrl); > + goto again; > + } > + > + field = le32_to_cpu(event->trans_event.flags); > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id); > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index); > BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) - > buffer > (size_t)length); > > - record_transfer_result(udev, event, length); > + record_transfer_result(udev, event, available_length); > xhci_acknowledge_event(ctrl); > xhci_inval_cache((uintptr_t)buffer, length); > Regards, Bin
Hi Ran, On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote: > > Hi Ran, > > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will > > typo: function > > > send request in more than 1 Transfer TRB by chaining them, but then handle > > only 1 event TRB to mark request completed. > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci > > controller will generated more than 1 event TRB sometimes, this cause that > > I am not sure if it's fair to say this is Layerscape specific > behavior. Based on the xHCI spec, the spec indicates 1 event trb > should be generated when IOC/ISP flag is set to 1 or an error occurs. Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: /* Only set interrupt on short packet for IN endpoints */ if (usb_pipein(pipe)) field |= TRB_ISP; Can you verify that if removing the above codes, and without your changes in this patch, the original issue can be resolved on LS1028? > I will see if I can reproduce your issue on an x86 board. > Note this patch does not apply on top of u-boot/master. Please rebase. > > function mishandle event TRB in next round call, then system hang due to > > BUG() checking. > > > > This patch adds a loop to make sure the event TRB for last Transfer TRB has > > to be handled in time. > > > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com> > > --- > > drivers/usb/host/xhci-ring.c | 17 ++++++++++++++--- > > 1 file changed, 14 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > > index 092ed6e..d77e058 100644 > > --- a/drivers/usb/host/xhci-ring.c > > +++ b/drivers/usb/host/xhci-ring.c > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > > int ret; > > u32 trb_fields[4]; > > u64 val_64 = virt_to_phys(buffer); > > + void *last_transfer_trb_addr; > > + int available_length; > > > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n", > > udev, pipe, buffer, length); > > > > + available_length = length; > > ep_index = usb_pipe_ep_index(pipe); > > virt_dev = ctrl->devs[slot_id]; > > > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > > trb_fields[2] = length_field; > > trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT); > > > > - queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > > + last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > > > > --num_trbs; > > > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > > > > giveback_first_trb(udev, ep_index, start_cycle, start_trb); > > > > +again: > > event = xhci_wait_for_event(ctrl, TRB_TRANSFER); > > if (!event) { > > debug("XHCI bulk transfer timed out, aborting...\n"); > > @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, > > udev->act_len = 0; > > return -ETIMEDOUT; > > } > > - field = le32_to_cpu(event->trans_event.flags); > > > > + if ((void *)event->trans_event.buffer != last_transfer_trb_addr) { > > This should be: > > if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr) > > > + available_length -= > > + (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)); > > + xhci_acknowledge_event(ctrl); > > + goto again; > > + } > > + > > + field = le32_to_cpu(event->trans_event.flags); > > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id); > > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index); > > BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) - > > buffer > (size_t)length); > > > > - record_transfer_result(udev, event, length); > > + record_transfer_result(udev, event, available_length); > > xhci_acknowledge_event(ctrl); > > xhci_inval_cache((uintptr_t)buffer, length); > > Regards, Bin
Hi Bin, On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote: > > Hi Ran, > > On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote: > > > > Hi Ran, > > > > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it > > > will > > > > typo: function Got it. > > > send request in more than 1 Transfer TRB by chaining them, but then > > > handle only 1 event TRB to mark request completed. > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe > > > xhci controller will generated more than 1 event TRB sometimes, this > > > cause that > > > > I am not sure if it's fair to say this is Layerscape specific > > behavior. Based on the xHCI spec, the spec indicates 1 event trb > > should be generated when IOC/ISP flag is set to 1 or an error occurs. > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: > > /* Only set interrupt on short packet for IN endpoints */ > if (usb_pipein(pipe)) > field |= TRB_ISP; > > Can you verify that if removing the above codes, and without your changes in > this patch, the original issue can be resolved on LS1028? Bingo, removing above codes can resolve my issue, too > > I will see if I can reproduce your issue on an x86 board. > > > > Note this patch does not apply on top of u-boot/master. Please rebase. Sure, I can rebase my patch, but I am nor sure my solution is still worth: xHCI says "The detection of a USB Short Packet (i.e. the actual number of bytes received was less than the expected number of bytes defined by the Transfer TRB) during a transfer does not necessarily generate an Event. " So does it mean above workaround you suggest would not violate spec, either (although current Linux kernel driver still set ISP for this case, but may have a more robust mechanism for event TRB handling)? > > > function mishandle event TRB in next round call, then system hang > > > due to > > > BUG() checking. > > > > > > This patch adds a loop to make sure the event TRB for last Transfer > > > TRB has to be handled in time. > > > > > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com> > > > --- > > > drivers/usb/host/xhci-ring.c | 17 ++++++++++++++--- > > > 1 file changed, 14 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/usb/host/xhci-ring.c > > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644 > > > --- a/drivers/usb/host/xhci-ring.c > > > +++ b/drivers/usb/host/xhci-ring.c > > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, > unsigned long pipe, > > > int ret; > > > u32 trb_fields[4]; > > > u64 val_64 = virt_to_phys(buffer); > > > + void *last_transfer_trb_addr; > > > + int available_length; > > > > > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n", > > > udev, pipe, buffer, length); > > > > > > + available_length = length; > > > ep_index = usb_pipe_ep_index(pipe); > > > virt_dev = ctrl->devs[slot_id]; > > > > > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned > long pipe, > > > trb_fields[2] = length_field; > > > trb_fields[3] = field | (TRB_NORMAL << > > > TRB_TYPE_SHIFT); > > > > > > - queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > > > + last_transfer_trb_addr = queue_trb(ctrl, ring, > > > + (num_trbs > 1), trb_fields); > > > > > > --num_trbs; > > > > > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, > > > unsigned long pipe, > > > > > > giveback_first_trb(udev, ep_index, start_cycle, start_trb); > > > > > > +again: > > > event = xhci_wait_for_event(ctrl, TRB_TRANSFER); > > > if (!event) { > > > debug("XHCI bulk transfer timed out, > > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct > usb_device *udev, unsigned long pipe, > > > udev->act_len = 0; > > > return -ETIMEDOUT; > > > } > > > - field = le32_to_cpu(event->trans_event.flags); > > > > > > + if ((void *)event->trans_event.buffer != > > > + last_transfer_trb_addr) { > > > > This should be: > > > > if ((void *)le64_to_cpu(event->trans_event.buffer) != > > last_transfer_trb_addr) Got it, will update this in next version (if you think my solution is still acceptable for this issue). Thanks & regards, Ran > > > + available_length -= > > > + > (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)); > > > + xhci_acknowledge_event(ctrl); > > > + goto again; > > > + } > > > + > > > + field = le32_to_cpu(event->trans_event.flags); > > > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id); > > > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index); > > > BUG_ON(*(void > **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) - > > > buffer > (size_t)length); > > > > > > - record_transfer_result(udev, event, length); > > > + record_transfer_result(udev, event, available_length); > > > xhci_acknowledge_event(ctrl); > > > xhci_inval_cache((uintptr_t)buffer, length); > > > > > Regards, > Bin
Hi Ran, On Tue, Nov 10, 2020 at 3:36 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > Hi Bin, > > On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote: > > > > Hi Ran, > > > > On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote: > > > > > > Hi Ran, > > > > > > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > > > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it > > > > will > > > > > > typo: function > > Got it. > > > > > send request in more than 1 Transfer TRB by chaining them, but then > > > > handle only 1 event TRB to mark request completed. > > > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe > > > > xhci controller will generated more than 1 event TRB sometimes, this > > > > cause that > > > > > > I am not sure if it's fair to say this is Layerscape specific > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb > > > should be generated when IOC/ISP flag is set to 1 or an error occurs. > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: > > > > /* Only set interrupt on short packet for IN endpoints */ > > if (usb_pipein(pipe)) > > field |= TRB_ISP; > > > > Can you verify that if removing the above codes, and without your changes in > > this patch, the original issue can be resolved on LS1028? > > Bingo, removing above codes can resolve my issue, too Thank you for your testing. > > > > I will see if I can reproduce your issue on an x86 board. > > > > > > > Note this patch does not apply on top of u-boot/master. Please rebase. > > Sure, I can rebase my patch, but I am nor sure my solution is still worth: > xHCI says "The detection of a USB Short Packet (i.e. the actual number of > bytes received was less than the expected number of bytes defined by the > Transfer TRB) during a transfer does not necessarily generate an Event. " > Yes, that's what the spec says. So in your case, can you add some logs to verify that there is a transfer event trb generated by the xHC and the completion code is 13 (short packet)? You can add some debug output in record_transfer_result(). > So does it mean above workaround you suggest would not violate spec, either > (although current Linux kernel driver still set ISP for this case, but may have > a more robust mechanism for event TRB handling)? I need to do more testing to see if this is a LS1028 behavior or generic xHCI behavior, ie: on an x86 board. Do you happen to know if there is any errata document for LS1028 that is related to this? > > > > > > function mishandle event TRB in next round call, then system hang > > > > due to > > > > BUG() checking. > > > > > > > > This patch adds a loop to make sure the event TRB for last Transfer > > > > TRB has to be handled in time. > > > > > > > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com> > > > > --- > > > > drivers/usb/host/xhci-ring.c | 17 ++++++++++++++--- > > > > 1 file changed, 14 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/usb/host/xhci-ring.c > > > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644 > > > > --- a/drivers/usb/host/xhci-ring.c > > > > +++ b/drivers/usb/host/xhci-ring.c > > > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, > > unsigned long pipe, > > > > int ret; > > > > u32 trb_fields[4]; > > > > u64 val_64 = virt_to_phys(buffer); > > > > + void *last_transfer_trb_addr; > > > > + int available_length; > > > > > > > > debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n", > > > > udev, pipe, buffer, length); > > > > > > > > + available_length = length; > > > > ep_index = usb_pipe_ep_index(pipe); > > > > virt_dev = ctrl->devs[slot_id]; > > > > > > > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned > > long pipe, > > > > trb_fields[2] = length_field; > > > > trb_fields[3] = field | (TRB_NORMAL << > > > > TRB_TYPE_SHIFT); > > > > > > > > - queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); > > > > + last_transfer_trb_addr = queue_trb(ctrl, ring, > > > > + (num_trbs > 1), trb_fields); > > > > > > > > --num_trbs; > > > > > > > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, > > > > unsigned long pipe, > > > > > > > > giveback_first_trb(udev, ep_index, start_cycle, start_trb); > > > > > > > > +again: > > > > event = xhci_wait_for_event(ctrl, TRB_TRANSFER); > > > > if (!event) { > > > > debug("XHCI bulk transfer timed out, > > > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct > > usb_device *udev, unsigned long pipe, > > > > udev->act_len = 0; > > > > return -ETIMEDOUT; > > > > } > > > > - field = le32_to_cpu(event->trans_event.flags); > > > > > > > > + if ((void *)event->trans_event.buffer != > > > > + last_transfer_trb_addr) { > > > > > > This should be: > > > > > > if ((void *)le64_to_cpu(event->trans_event.buffer) != > > > last_transfer_trb_addr) > > Got it, will update this in next version (if you think my solution is still acceptable for this issue). > > Thanks & regards, > Ran > > > > > + available_length -= > > > > + > > (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)); > > > > + xhci_acknowledge_event(ctrl); > > > > + goto again; > > > > + } > > > > + > > > > + field = le32_to_cpu(event->trans_event.flags); > > > > BUG_ON(TRB_TO_SLOT_ID(field) != slot_id); > > > > BUG_ON(TRB_TO_EP_INDEX(field) != ep_index); > > > > BUG_ON(*(void > > **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) - > > > > buffer > (size_t)length); > > > > > > > > - record_transfer_result(udev, event, length); > > > > + record_transfer_result(udev, event, available_length); > > > > xhci_acknowledge_event(ctrl); > > > > xhci_inval_cache((uintptr_t)buffer, length); > > > > Regards, Bin
Hi Bin, On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote: > > Hi Ran, <snip> > > > > > send request in more than 1 Transfer TRB by chaining them, but > > > > > then handle only 1 event TRB to mark request completed. > > > > > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we > > > > > observe xhci controller will generated more than 1 event TRB > > > > > sometimes, this cause that > > > > > > > > I am not sure if it's fair to say this is Layerscape specific > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs. > > > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: > > > > > > /* Only set interrupt on short packet for IN endpoints */ > > > if (usb_pipein(pipe)) > > > field |= TRB_ISP; > > > > > > Can you verify that if removing the above codes, and without your > > > changes in this patch, the original issue can be resolved on LS1028? > > > > Bingo, removing above codes can resolve my issue, too > > Thank you for your testing. > > > > > > > I will see if I can reproduce your issue on an x86 board. > > > > > > > > > > Note this patch does not apply on top of u-boot/master. Please rebase. > > > > Sure, I can rebase my patch, but I am nor sure my solution is still worth: > > xHCI says "The detection of a USB Short Packet (i.e. the actual number > > of bytes received was less than the expected number of bytes defined > > by the Transfer TRB) during a transfer does not necessarily generate an Event. > " > > > > Yes, that's what the spec says. So in your case, can you add some logs to verify > that there is a transfer event trb generated by the xHC and the completion > code is 13 (short packet)? You can add some debug output in > record_transfer_result(). Sure, let me try this can get back to you later. > > So does it mean above workaround you suggest would not violate spec, > > either (although current Linux kernel driver still set ISP for this > > case, but may have a more robust mechanism for event TRB handling)? > > I need to do more testing to see if this is a LS1028 behavior or generic xHCI > behavior, ie: on an x86 board. Do you happen to know if there is any errata > document for LS1028 that is related to this? I don’t see any errata for LS1028A for this case. Actually this issue only randomly happens in U-Boot + USB dongle (I have tried 3 different brands, such as TP-LINK UE300, etc). And U-Boot + USB mass storage devicecannot reproduce this issue. Under Linux, no such issue observed neither (with those dongles), all of them works fine. That's why I decide to improve event TRB handling logic here. Below is the U-Boot test commands I used (repeatedly do tftp file transfer), issue would happen in less 5 minutes. for your reference: setenv tt 'usb reset&&setenv ethact asix_eth&&tftp a0000000 tmp/1MB_test_file' setenv ttt 'run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt' setenv tttt 'run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt' setenv u 'run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt' run u Regards, Ran
Hi Bin, On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: > Hi Bin, > > On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote: > > > > Hi Ran, > <snip> > > > > > > send request in more than 1 Transfer TRB by chaining them, but > > > > > > then handle only 1 event TRB to mark request completed. > > > > > > > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we > > > > > > observe xhci controller will generated more than 1 event TRB > > > > > > sometimes, this cause that > > > > > > > > > > I am not sure if it's fair to say this is Layerscape specific > > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb > > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs. > > > > > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: > > > > > > > > /* Only set interrupt on short packet for IN endpoints */ > > > > if (usb_pipein(pipe)) > > > > field |= TRB_ISP; > > > > > > > > Can you verify that if removing the above codes, and without your > > > > changes in this patch, the original issue can be resolved on LS1028? > > > > > > Bingo, removing above codes can resolve my issue, too > > > > Thank you for your testing. > > > > > > > > > > I will see if I can reproduce your issue on an x86 board. > > > > > > > > > > > > > Note this patch does not apply on top of u-boot/master. Please rebase. > > > > > > Sure, I can rebase my patch, but I am nor sure my solution is still worth: > > > xHCI says "The detection of a USB Short Packet (i.e. the actual > > > number of bytes received was less than the expected number of bytes > > > defined by the Transfer TRB) during a transfer does not necessarily generate > an Event. > > " > > > > > > > Yes, that's what the spec says. So in your case, can you add some logs > > to verify that there is a transfer event trb generated by the xHC and > > the completion code is 13 (short packet)? You can add some debug > > output in record_transfer_result(). > > Sure, let me try this can get back to you later. Confirm completion code is 13 when issue happen (please search ' complete_code:13'): dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition xhci_bulk_tx()#0.1.running_total:0x280 xhci_bulk_tx()#0.2.trb_buff_len:0x280 xhci_bulk_tx()#0.3.running_total:0x280 xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB xhci_bulk_tx()#0.5.running_total:0x10280 xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB xhci_bulk_tx()#0.addr:0xfbb4fd80 xhci_bulk_tx()#0.trb_buff_len:0x280 xhci_bulk_tx()#0.running_total:0x280 xhci_bulk_tx()#0.length:0x800 xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB xhci_bulk_tx()#0.addr:0xfbb50000 xhci_bulk_tx()#0.trb_buff_len:0x580 xhci_bulk_tx()#0.running_total:0x800 xhci_bulk_tx()#0.length:0x800 xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180 xhci_bulk_tx()#0.event->trans_event.flags:0x1058001 xhci_bulk_tx()#0.event->len:384 xhci_bulk_tx()#0.event->complete_code:13 xhci_bulk_tx()#1.field:0x1058001 xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4 xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05) xhci_bulk_tx()#0.1.running_total:0x640 xhci_bulk_tx()#0.2.trb_buff_len:0x640 xhci_bulk_tx()#0.3.running_total:0x640 xhci_bulk_tx()#0.4.num_trbs:0x1 xhci_bulk_tx()#0.5.running_total:0x640 xhci_bulk_tx()#0.start_trb:0x00000000fbb47610 ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB xhci_bulk_tx()#0.addr:0xfbb2f9c0 xhci_bulk_tx()#0.trb_buff_len:0x6e xhci_bulk_tx()#0.running_total:0x6e xhci_bulk_tx()#0.length:0x6e xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB') xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000 xhci_bulk_tx()#0.event->trans_event.flags:0x1058000 xhci_bulk_tx()#0.event->len:0 xhci_bulk_tx()#0.event->complete_code:1 xhci_bulk_tx()#1.field:0x1058000 xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG() xhci_bulk_tx()#3.ep_index:0x5------------------------------------- ep_index mis-match wait for event again!!!!!!!!!---------------------------------- BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()! BUG! ### ERROR ### Please RESET the board ### Regards, Ran
Hi Ran, On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > Hi Bin, > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: > > Hi Bin, > > > > On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote: > > > > > > Hi Ran, > > <snip> > > > > > > > send request in more than 1 Transfer TRB by chaining them, but > > > > > > > then handle only 1 event TRB to mark request completed. > > > > > > > > > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we > > > > > > > observe xhci controller will generated more than 1 event TRB > > > > > > > sometimes, this cause that > > > > > > > > > > > > I am not sure if it's fair to say this is Layerscape specific > > > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb > > > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs. > > > > > > > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have: > > > > > > > > > > /* Only set interrupt on short packet for IN endpoints */ > > > > > if (usb_pipein(pipe)) > > > > > field |= TRB_ISP; > > > > > > > > > > Can you verify that if removing the above codes, and without your > > > > > changes in this patch, the original issue can be resolved on LS1028? > > > > > > > > Bingo, removing above codes can resolve my issue, too > > > > > > Thank you for your testing. > > > > > > > > > > > > > I will see if I can reproduce your issue on an x86 board. > > > > > > > > > > > > > > > > Note this patch does not apply on top of u-boot/master. Please rebase. > > > > > > > > Sure, I can rebase my patch, but I am nor sure my solution is still worth: > > > > xHCI says "The detection of a USB Short Packet (i.e. the actual > > > > number of bytes received was less than the expected number of bytes > > > > defined by the Transfer TRB) during a transfer does not necessarily generate > > an Event. > > > " > > > > > > > > > > Yes, that's what the spec says. So in your case, can you add some logs > > > to verify that there is a transfer event trb generated by the xHC and > > > the completion code is 13 (short packet)? You can add some debug > > > output in record_transfer_result(). > > > > Sure, let me try this can get back to you later. > > Confirm completion code is 13 when issue happen (please search ' complete_code:13'): Thank you very much for your testing. > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition > xhci_bulk_tx()#0.1.running_total:0x280 > xhci_bulk_tx()#0.2.trb_buff_len:0x280 > xhci_bulk_tx()#0.3.running_total:0x280 > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB > xhci_bulk_tx()#0.5.running_total:0x10280 > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB > xhci_bulk_tx()#0.addr:0xfbb4fd80 > xhci_bulk_tx()#0.trb_buff_len:0x280 > xhci_bulk_tx()#0.running_total:0x280 > xhci_bulk_tx()#0.length:0x800 > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB > xhci_bulk_tx()#0.addr:0xfbb50000 > xhci_bulk_tx()#0.trb_buff_len:0x580 > xhci_bulk_tx()#0.running_total:0x800 > xhci_bulk_tx()#0.length:0x800 > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB > xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180 > xhci_bulk_tx()#0.event->trans_event.flags:0x1058001 > xhci_bulk_tx()#0.event->len:384 > xhci_bulk_tx()#0.event->complete_code:13 > xhci_bulk_tx()#1.field:0x1058001 > xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4 > xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return > > dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05) > xhci_bulk_tx()#0.1.running_total:0x640 > xhci_bulk_tx()#0.2.trb_buff_len:0x640 > xhci_bulk_tx()#0.3.running_total:0x640 > xhci_bulk_tx()#0.4.num_trbs:0x1 > xhci_bulk_tx()#0.5.running_total:0x640 > xhci_bulk_tx()#0.start_trb:0x00000000fbb47610 > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB > xhci_bulk_tx()#0.addr:0xfbb2f9c0 > xhci_bulk_tx()#0.trb_buff_len:0x6e > xhci_bulk_tx()#0.running_total:0x6e > xhci_bulk_tx()#0.length:0x6e > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB') > xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000 > xhci_bulk_tx()#0.event->trans_event.flags:0x1058000 > xhci_bulk_tx()#0.event->len:0 > xhci_bulk_tx()#0.event->complete_code:1 > xhci_bulk_tx()#1.field:0x1058000 > xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG() > xhci_bulk_tx()#3.ep_index:0x5------------------------------------- > ep_index mis-match wait for event again!!!!!!!!!---------------------------------- > BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()! > BUG! > ### ERROR ### Please RESET the board ### > Based on your log, it matches what I suspected and we are getting close to the root cause. Could you please try the following simple test case to see if it can trigger the issue with a USB disk? After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb start" to see if the USB disk can be recognized by U-Boot? commit 60e68ed667362870c20b36ae26dacc1af903e81e Author: Bin Meng <bmeng.cn@gmail.com> Date: Tue Nov 10 16:19:06 2020 +0800 WIP: usb: A simple test case to trigger the TRB 64K boundary issue with mass storage device This should not be applied as it aims to provide a simple test case to trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP. Signed-off-by: Bin Meng <bmeng.cn@gmail.com> diff --git a/common/usb_storage.c b/common/usb_storage.c index ff25441..c8aec2e 100644 --- a/common/usb_storage.c +++ b/common/usb_storage.c @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd *srb, struct us_data *us) int dir_in; int actlen, data_actlen; unsigned int pipe, pipein, pipeout; +#if 0 ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1); +#else + struct umass_bbb_csw *csw_org, *csw; + csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE); + csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000); + csw = (struct umass_bbb_csw *)((ulong)csw - 2); + printf("csw org %p, adjusted to %p\n", csw_org, csw); +#endif #ifdef BBB_XPORT_TRACE unsigned char *ptr; int index; @@ -824,6 +832,7 @@ again: return USB_STOR_TRANSPORT_FAILED; } + free(csw_org); return result; } Regards, Bin
Hi Bin, On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote: > > Hi Ran, > > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > Hi Bin, > > > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: > > > Hi Bin, <snip> > > > > Based on your log, it matches what I suspected and we are getting close to the > root cause. > > Could you please try the following simple test case to see if it can trigger the > issue with a USB disk? > After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb > start" to see if the USB disk can be recognized by U-Boot? > > commit 60e68ed667362870c20b36ae26dacc1af903e81e > Author: Bin Meng <bmeng.cn@gmail.com> > Date: Tue Nov 10 16:19:06 2020 +0800 > > WIP: usb: A simple test case to trigger the TRB 64K boundary issue with > mass storage device > > This should not be applied as it aims to provide a simple test case to > trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP. > > Signed-off-by: Bin Meng <bmeng.cn@gmail.com> > > diff --git a/common/usb_storage.c b/common/usb_storage.c index > ff25441..c8aec2e 100644 > --- a/common/usb_storage.c > +++ b/common/usb_storage.c > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd > *srb, struct us_data *us) > int dir_in; > int actlen, data_actlen; > unsigned int pipe, pipein, pipeout; > +#if 0 > ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1); > +#else > + struct umass_bbb_csw *csw_org, *csw; > + csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE); > + csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000); > + csw = (struct umass_bbb_csw *)((ulong)csw - 2); > + printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif > #ifdef BBB_XPORT_TRACE > unsigned char *ptr; > int index; > @@ -824,6 +832,7 @@ again: > return USB_STOR_TRANSPORT_FAILED; > } > > + free(csw_org); > return result; > } > Issue cannot be reproduced on mass storage device with above hacking: => resetting USB... Bus usb3@3100000: Register 200017f NbrPorts 2 Starting the controller USB XHCI 1.00 Bus usb3@3110000: Register 200017f NbrPorts 2 Starting the controller USB XHCI 1.00 scanning bus usb3@3100000 for devices... record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 13 record_transfer_result():comp_code: 13 record_transfer_result():comp_code: 13 record_transfer_result():comp_code: 13 record_transfer_result():comp_code: 1 csw org 00000000fbb53010, adjusted to 00000000fbb5fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb53010, adjusted to 00000000fbb5fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb53010, adjusted to 00000000fbb5fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 2 USB Device(s) found scanning bus usb3@3110000 for devices... 1 USB Device(s) found scanning usb for storage devices... 1 Storage Device(s) found => One more thing I'd like to tell is that I found this issue would not happened with 'usb start' (even with USB dongle). Below is the log I tested with ' usb read 0x82000000 0 300000', issue could not be reproduced for at least 10 minutes ... csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 record_transfer_result():comp_code: 1 ... Regards, Ran
Hi Ran, On Tue, Nov 10, 2020 at 5:41 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > Hi Bin, > > On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote: > > > > Hi Ran, > > > > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > > > Hi Bin, > > > > > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: > > > > Hi Bin, > > <snip> > > > > > > > > Based on your log, it matches what I suspected and we are getting close to the > > root cause. > > > > Could you please try the following simple test case to see if it can trigger the > > issue with a USB disk? > > After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb > > start" to see if the USB disk can be recognized by U-Boot? > > > > commit 60e68ed667362870c20b36ae26dacc1af903e81e > > Author: Bin Meng <bmeng.cn@gmail.com> > > Date: Tue Nov 10 16:19:06 2020 +0800 > > > > WIP: usb: A simple test case to trigger the TRB 64K boundary issue with > > mass storage device > > > > This should not be applied as it aims to provide a simple test case to > > trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP. > > > > Signed-off-by: Bin Meng <bmeng.cn@gmail.com> > > > > diff --git a/common/usb_storage.c b/common/usb_storage.c index > > ff25441..c8aec2e 100644 > > --- a/common/usb_storage.c > > +++ b/common/usb_storage.c > > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd > > *srb, struct us_data *us) > > int dir_in; > > int actlen, data_actlen; > > unsigned int pipe, pipein, pipeout; > > +#if 0 > > ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1); > > +#else > > + struct umass_bbb_csw *csw_org, *csw; > > + csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE); > > + csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000); > > + csw = (struct umass_bbb_csw *)((ulong)csw - 2); > > + printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif > > #ifdef BBB_XPORT_TRACE > > unsigned char *ptr; > > int index; > > @@ -824,6 +832,7 @@ again: > > return USB_STOR_TRANSPORT_FAILED; > > } > > > > + free(csw_org); > > return result; > > } > > > > Issue cannot be reproduced on mass storage device with above hacking: Thank you for the testing. > => > resetting USB... > Bus usb3@3100000: Register 200017f NbrPorts 2 > Starting the controller > USB XHCI 1.00 > Bus usb3@3110000: Register 200017f NbrPorts 2 > Starting the controller > USB XHCI 1.00 > scanning bus usb3@3100000 for devices... record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 13 > record_transfer_result():comp_code: 13 > record_transfer_result():comp_code: 13 > record_transfer_result():comp_code: 13 > record_transfer_result():comp_code: 1 > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > 2 USB Device(s) found > scanning bus usb3@3110000 for devices... 1 USB Device(s) found > scanning usb for storage devices... 1 Storage Device(s) found > => > > One more thing I'd like to tell is that I found this issue would not happened with 'usb start' (even with USB dongle). > > Below is the log I tested with ' usb read 0x82000000 0 300000', issue could not be reproduced for at least 10 minutes > ... > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 > record_transfer_result():comp_code: 1 In the earlier log you sent, you have the following: dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition xhci_bulk_tx()#0.1.running_total:0x280 xhci_bulk_tx()#0.2.trb_buff_len:0x280 xhci_bulk_tx()#0.3.running_total:0x280 xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB xhci_bulk_tx()#0.5.running_total:0x10280 xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB xhci_bulk_tx()#0.addr:0xfbb4fd80 xhci_bulk_tx()#0.trb_buff_len:0x280 xhci_bulk_tx()#0.running_total:0x280 xhci_bulk_tx()#0.length:0x800 xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB xhci_bulk_tx()#0.addr:0xfbb50000 xhci_bulk_tx()#0.trb_buff_len:0x580 xhci_bulk_tx()#0.running_total:0x800 xhci_bulk_tx()#0.length:0x800 xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 Could you please specify where this output is for? "xhci_bulk_tx()#0.length:0x800" Regards, Bin
Hi Bin, On Tuesday, November 10, 2020 5:47 PM Bin Meng wrote: > > Hi Ran, > > On Tue, Nov 10, 2020 at 5:41 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > Hi Bin, > > > > On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote: > > > > > > Hi Ran, > > > > > > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> > wrote: > > > > > > > > Hi Bin, > > > > > > > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: > > > > > Hi Bin, > > > > <snip> > > > > > > > > > > > > Based on your log, it matches what I suspected and we are getting > > > close to the root cause. > > > > > > Could you please try the following simple test case to see if it can > > > trigger the issue with a USB disk? > > > After U-Boot boot on LS1028, with a USB flash disk attached, then > > > type "usb start" to see if the USB disk can be recognized by U-Boot? > > > > > > commit 60e68ed667362870c20b36ae26dacc1af903e81e > > > Author: Bin Meng <bmeng.cn@gmail.com> > > > Date: Tue Nov 10 16:19:06 2020 +0800 > > > > > > WIP: usb: A simple test case to trigger the TRB 64K boundary > > > issue with mass storage device > > > > > > This should not be applied as it aims to provide a simple test case to > > > trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP. > > > > > > Signed-off-by: Bin Meng <bmeng.cn@gmail.com> > > > > > > diff --git a/common/usb_storage.c b/common/usb_storage.c index > > > ff25441..c8aec2e 100644 > > > --- a/common/usb_storage.c > > > +++ b/common/usb_storage.c > > > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct > > > scsi_cmd *srb, struct us_data *us) > > > int dir_in; > > > int actlen, data_actlen; > > > unsigned int pipe, pipein, pipeout; > > > +#if 0 > > > ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1); > > > +#else > > > + struct umass_bbb_csw *csw_org, *csw; > > > + csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE); > > > + csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, > 0x10000); > > > + csw = (struct umass_bbb_csw *)((ulong)csw - 2); > > > + printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif > > > #ifdef BBB_XPORT_TRACE > > > unsigned char *ptr; > > > int index; > > > @@ -824,6 +832,7 @@ again: > > > return USB_STOR_TRANSPORT_FAILED; > > > } > > > > > > + free(csw_org); > > > return result; > > > } > > > > > > > Issue cannot be reproduced on mass storage device with above hacking: > > Thank you for the testing. > > > => > > resetting USB... > > Bus usb3@3100000: Register 200017f NbrPorts 2 Starting the controller > > USB XHCI 1.00 Bus usb3@3110000: Register 200017f NbrPorts 2 Starting > > the controller USB XHCI 1.00 scanning bus usb3@3100000 for devices... > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 13 > > record_transfer_result():comp_code: 13 > > record_transfer_result():comp_code: 13 > > record_transfer_result():comp_code: 13 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > 2 USB Device(s) found > > scanning bus usb3@3110000 for devices... 1 USB Device(s) found > > scanning usb for storage devices... 1 Storage Device(s) found > > => > > > > One more thing I'd like to tell is that I found this issue would not happened > with 'usb start' (even with USB dongle). > > > > Below is the log I tested with ' usb read 0x82000000 0 300000', issue > > could not be reproduced for at least 10 minutes ... > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > record_transfer_result():comp_code: 1 > > In the earlier log you sent, you have the following: > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, > length=2048----------------------------buffer would cross 64KB boundary, so we will > send request in more than 1 TRB, this is the key issue trigger condition > xhci_bulk_tx()#0.1.running_total:0x280 > xhci_bulk_tx()#0.2.trb_buff_len:0x280 > xhci_bulk_tx()#0.3.running_total:0x280 > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB > xhci_bulk_tx()#0.5.running_total:0x10280 > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140--------- > -Assemble > 1st Transfer TRB > xhci_bulk_tx()#0.addr:0xfbb4fd80 > xhci_bulk_tx()#0.trb_buff_len:0x280 > xhci_bulk_tx()#0.running_total:0x280 > xhci_bulk_tx()#0.length:0x800 > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150--------- > -Assemble > 2nd Transfer TRB > xhci_bulk_tx()#0.addr:0xfbb50000 > xhci_bulk_tx()#0.trb_buff_len:0x580 > xhci_bulk_tx()#0.running_total:0x800 > xhci_bulk_tx()#0.length:0x800 > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > Could you please specify where this output is for? All this prints is within function xhci_bulk_tx(), Let me open those verbose output and check if that crossing 64KB case has been triggered or not. But I have to get back to you later (tomorrow). Regards, Ran > "xhci_bulk_tx()#0.length:0x800" > > Regards, > Bin
Hi Bin On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote: > > Hi Bin, > <snip> > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > adjusted to 00000000fbb4fffe > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > adjusted to 00000000fbb4fffe > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > adjusted to 00000000fbb4fffe > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 > > > record_transfer_result():comp_code: 1 > > > > In the earlier log you sent, you have the following: > > > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, > > length=2048----------------------------buffer would cross 64KB > > boundary, so we will send request in more than 1 TRB, this is the key > > issue trigger condition > > xhci_bulk_tx()#0.1.running_total:0x280 > > xhci_bulk_tx()#0.2.trb_buff_len:0x280 > > xhci_bulk_tx()#0.3.running_total:0x280 > > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer > > TRB > > xhci_bulk_tx()#0.5.running_total:0x10280 > > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140- > > -------- > > -Assemble > > 1st Transfer TRB > > xhci_bulk_tx()#0.addr:0xfbb4fd80 > > xhci_bulk_tx()#0.trb_buff_len:0x280 > > xhci_bulk_tx()#0.running_total:0x280 > > xhci_bulk_tx()#0.length:0x800 > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150- > > -------- > > -Assemble > > 2nd Transfer TRB > > xhci_bulk_tx()#0.addr:0xfbb50000 > > xhci_bulk_tx()#0.trb_buff_len:0x580 > > xhci_bulk_tx()#0.running_total:0x800 > > xhci_bulk_tx()#0.length:0x800 > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > > > Could you please specify where this output is for? > > All this prints is within function xhci_bulk_tx(), > > Let me open those verbose output and check if that crossing 64KB case has been > triggered or not. > But I have to get back to you later (tomorrow). With your test commit and my verbose print code, issue cannot be reproduced in mass storage usage (although crossing 64KB boundary condition has met), so looks like the only different to ethernet device is that if xHC would receive short packet: => pri t t=usb read 0x82000000 0 300000 => run t usb read: device 0 block # 0, count 3145728 ... usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1a9c0, length=0x1f xhci_bulk_tx()#0.4.num_trbs:0x1 ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49570 xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49570 xhci_bulk_tx()#0.event->complete_code:1 xhci_bulk_tx()#3.ep_index:0x3 xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb1ab00, length=0x200 xhci_bulk_tx()#0.4.num_trbs:0x1 xhci_bulk_tx()#set TRB_ISP------------------------------------ ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49180 xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49180 xhci_bulk_tx()#0.event->complete_code:1 xhci_bulk_tx()#3.ep_index:0x2 xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB boundary xhci_bulk_tx()#0.4.num_trbs:0x2 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49190----------------1st transfer TRB queued xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a0----------------2nd transfer TRB queued xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb491a0---------------------event TRB that pointing to 2nd transfer TRB xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet xhci_bulk_tx()#3.ep_index:0x2 num_trbs_count > 1!!!!!!!!! ... usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1ad00, length=0x1f xhci_bulk_tx()#0.4.num_trbs:0x1 ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49790 xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49790 xhci_bulk_tx()#0.event->complete_code:1 xhci_bulk_tx()#3.ep_index:0x3 xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000883d8000, length=0x1e000-----------------buffer cross 64KB boundary xhci_bulk_tx()#0.4.num_trbs:0x3 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a0----------------1st transfer TRB queued xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b0----------------2nd transfer TRB queued xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c0----------------3rd transfer TRB queued xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492c0---------------------event TRB that pointing to 3rd transfer TRB xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet xhci_bulk_tx()#3.ep_index:0x2 num_trbs_count > 1!!!!!!!!! xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd xhci_bulk_tx()#0.4.num_trbs:0x2 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d0 xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0 xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e0 xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0 --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492e0 xhci_bulk_tx()#0.event->complete_code:1 xhci_bulk_tx()#3.ep_index:0x2 num_trbs_count > 1!!!!!!!!! Regards Ran
Hi Ran, On Wed, Nov 11, 2020 at 4:07 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > Hi Bin > > On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote: > > > > Hi Bin, > > > > <snip> > > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > adjusted to 00000000fbb4fffe > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > adjusted to 00000000fbb4fffe > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > adjusted to 00000000fbb4fffe > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 > > > > record_transfer_result():comp_code: 1 > > > > > > In the earlier log you sent, you have the following: > > > > > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, > > > length=2048----------------------------buffer would cross 64KB > > > boundary, so we will send request in more than 1 TRB, this is the key > > > issue trigger condition > > > xhci_bulk_tx()#0.1.running_total:0x280 > > > xhci_bulk_tx()#0.2.trb_buff_len:0x280 > > > xhci_bulk_tx()#0.3.running_total:0x280 > > > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer > > > TRB > > > xhci_bulk_tx()#0.5.running_total:0x10280 > > > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140- > > > -------- > > > -Assemble > > > 1st Transfer TRB > > > xhci_bulk_tx()#0.addr:0xfbb4fd80 > > > xhci_bulk_tx()#0.trb_buff_len:0x280 > > > xhci_bulk_tx()#0.running_total:0x280 > > > xhci_bulk_tx()#0.length:0x800 > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150- > > > -------- > > > -Assemble > > > 2nd Transfer TRB > > > xhci_bulk_tx()#0.addr:0xfbb50000 > > > xhci_bulk_tx()#0.trb_buff_len:0x580 > > > xhci_bulk_tx()#0.running_total:0x800 > > > xhci_bulk_tx()#0.length:0x800 > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > > > > > Could you please specify where this output is for? > > > > All this prints is within function xhci_bulk_tx(), > > > > Let me open those verbose output and check if that crossing 64KB case has been > > triggered or not. > > But I have to get back to you later (tomorrow). > > With your test commit and my verbose print code, issue cannot be reproduced in mass storage usage > (although crossing 64KB boundary condition has met), so looks like the only different to ethernet device > is that if xHC would receive short packet: > Thank you. > => pri t > t=usb read 0x82000000 0 300000 > => run t > > usb read: device 0 block # 0, count 3145728 ... > usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to 00000000fbb4fffe > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1a9c0, length=0x1f > xhci_bulk_tx()#0.4.num_trbs:0x1 > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49570 > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49570 > xhci_bulk_tx()#0.event->complete_code:1 > xhci_bulk_tx()#3.ep_index:0x3 > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb1ab00, length=0x200 > xhci_bulk_tx()#0.4.num_trbs:0x1 > xhci_bulk_tx()#set TRB_ISP------------------------------------ > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49180 > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49180 > xhci_bulk_tx()#0.event->complete_code:1 > xhci_bulk_tx()#3.ep_index:0x2 > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB boundary > xhci_bulk_tx()#0.4.num_trbs:0x2 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49190----------------1st transfer TRB queued > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a0----------------2nd transfer TRB queued > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb491a0---------------------event TRB that pointing to 2nd transfer TRB > xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet > xhci_bulk_tx()#3.ep_index:0x2 > num_trbs_count > 1!!!!!!!!! > ... > usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to 00000000fbb4fffe > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1ad00, length=0x1f > xhci_bulk_tx()#0.4.num_trbs:0x1 > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49790 > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49790 > xhci_bulk_tx()#0.event->complete_code:1 > xhci_bulk_tx()#3.ep_index:0x3 > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000883d8000, length=0x1e000-----------------buffer cross 64KB boundary > xhci_bulk_tx()#0.4.num_trbs:0x3 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a0----------------1st transfer TRB queued > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b0----------------2nd transfer TRB queued > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c0----------------3rd transfer TRB queued > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492c0---------------------event TRB that pointing to 3rd transfer TRB > xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet > xhci_bulk_tx()#3.ep_index:0x2 > num_trbs_count > 1!!!!!!!!! > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd > xhci_bulk_tx()#0.4.num_trbs:0x2 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d0 > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0 > xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e0 > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0 > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492e0 > xhci_bulk_tx()#0.event->complete_code:1 > xhci_bulk_tx()#3.ep_index:0x2 > num_trbs_count > 1!!!!!!!!! Please rebase this patch against u-boot/master, and reword the following commit message to indicate that it was because of the ISP flag set on an input pipe. In the 64K boundary case, when the first TRB ends up with a short packet condition it will trigger an short packet code transfer event per ISP flag, and current codes will only handle the first transfer event. > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci > controller will generated more than 1 event TRB sometimes, this cause that > function mishandle event TRB in next round call, then system hang due to > BUG() checking. Regards, Bin
Hi Bin, On Friday, November 13, 2020 2:25 PM Bin Meng wrote: > > Hi Ran, > > On Wed, Nov 11, 2020 at 4:07 PM Ran Wang <ran.wang_1@nxp.com> wrote: > > > > Hi Bin > > > > On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote: > > > > > > Hi Bin, > > > > > > > <snip> > > > > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > > adjusted to 00000000fbb4fffe > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > > adjusted to 00000000fbb4fffe > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90, > > > > > adjusted to 00000000fbb4fffe > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 > > > > > record_transfer_result():comp_code: 1 > > > > > > > > In the earlier log you sent, you have the following: > > > > > > > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, > > > > length=2048----------------------------buffer would cross 64KB > > > > boundary, so we will send request in more than 1 TRB, this is the > > > > key issue trigger condition > > > > xhci_bulk_tx()#0.1.running_total:0x280 > > > > xhci_bulk_tx()#0.2.trb_buff_len:0x280 > > > > xhci_bulk_tx()#0.3.running_total:0x280 > > > > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 > > > > Transfer TRB > > > > xhci_bulk_tx()#0.5.running_total:0x10280 > > > > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140 > > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47 > > > > 140- > > > > -------- > > > > -Assemble > > > > 1st Transfer TRB > > > > xhci_bulk_tx()#0.addr:0xfbb4fd80 > > > > xhci_bulk_tx()#0.trb_buff_len:0x280 > > > > xhci_bulk_tx()#0.running_total:0x280 > > > > xhci_bulk_tx()#0.length:0x800 > > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47 > > > > 150- > > > > -------- > > > > -Assemble > > > > 2nd Transfer TRB > > > > xhci_bulk_tx()#0.addr:0xfbb50000 > > > > xhci_bulk_tx()#0.trb_buff_len:0x580 > > > > xhci_bulk_tx()#0.running_total:0x800 > > > > xhci_bulk_tx()#0.length:0x800 > > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000 > > > > > > > > Could you please specify where this output is for? > > > > > > All this prints is within function xhci_bulk_tx(), > > > > > > Let me open those verbose output and check if that crossing 64KB > > > case has been triggered or not. > > > But I have to get back to you later (tomorrow). > > > > With your test commit and my verbose print code, issue cannot be > > reproduced in mass storage usage (although crossing 64KB boundary > > condition has met), so looks like the only different to ethernet device is that if > xHC would receive short packet: > > > > Thank you. > > > => pri t > > t=usb read 0x82000000 0 300000 > > => run t > > > > usb read: device 0 block # 0, count 3145728 ... > > usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to > > 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, > > buffer=00000000fbb1a9c0, length=0x1f > > xhci_bulk_tx()#0.4.num_trbs:0x1 > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4957 > > ++++++++++++0 > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 49570 > > xhci_bulk_tx()#0.event->complete_code:1 > > xhci_bulk_tx()#3.ep_index:0x3 > > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, > > buffer=00000000fbb1ab00, length=0x200 > > xhci_bulk_tx()#0.4.num_trbs:0x1 > > xhci_bulk_tx()#set TRB_ISP------------------------------------ > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4918 > > ++++++++++++0 > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 49180 > > xhci_bulk_tx()#0.event->complete_code:1 > > xhci_bulk_tx()#3.ep_index:0x2 > > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, > > buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB > > boundary > > xhci_bulk_tx()#0.4.num_trbs:0x2 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4919 > > ++++++++++++0----------------1st transfer TRB queued > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a > > ++++++++++++0----------------2nd transfer TRB queued > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 491a0---------------------event TRB that pointing to 2nd transfer TRB > > xhci_bulk_tx()#0.event->complete_code:1------------------------------- > > ----------------------------------not short packet > > xhci_bulk_tx()#3.ep_index:0x2 > > num_trbs_count > 1!!!!!!!!! > > ... > > usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to > > 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, > > buffer=00000000fbb1ad00, length=0x1f > > xhci_bulk_tx()#0.4.num_trbs:0x1 > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4979 > > ++++++++++++0 > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 49790 > > xhci_bulk_tx()#0.event->complete_code:1 > > xhci_bulk_tx()#3.ep_index:0x3 > > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, > > buffer=00000000883d8000, length=0x1e000-----------------buffer cross > > 64KB boundary > > xhci_bulk_tx()#0.4.num_trbs:0x3 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a > > ++++++++++++0----------------1st transfer TRB queued > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b > > ++++++++++++0----------------2nd transfer TRB queued > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c > > ++++++++++++0----------------3rd transfer TRB queued > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 492c0---------------------event TRB that pointing to 3rd transfer TRB > > xhci_bulk_tx()#0.event->complete_code:1------------------------------- > > ----------------------------------not short packet > > xhci_bulk_tx()#3.ep_index:0x2 > > num_trbs_count > 1!!!!!!!!! > > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, > > buffer=00000000fbb4fffe, length=0xd > > xhci_bulk_tx()#0.4.num_trbs:0x2 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d > > ++++++++++++0 > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0 > > xhci_bulk_tx()#set > > TRB_ISP--------------------------------------------------------------- > > ------------------------Set ISP > > > ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e > > ++++++++++++0 > > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0 > > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb > > 492e0 > > xhci_bulk_tx()#0.event->complete_code:1 > > xhci_bulk_tx()#3.ep_index:0x2 > > num_trbs_count > 1!!!!!!!!! > > Please rebase this patch against u-boot/master, and reword the following > commit message to indicate that it was because of the ISP flag set on an input > pipe. In the 64K boundary case, when the first TRB ends up with a short packet > condition it will trigger an short packet code transfer event per ISP flag, and > current codes will only handle the first transfer event. Got it, will work out v2, thanks for the review. Regards, Ran > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe > > xhci controller will generated more than 1 event TRB sometimes, this > > cause that function mishandle event TRB in next round call, then > > system hang due to > > BUG() checking. > > Regards, > Bin
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, int ret; u32 trb_fields[4]; u64 val_64 = virt_to_phys(buffer); + void *last_transfer_trb_addr; + int available_length; debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n", udev, pipe, buffer, length); + available_length = length; ep_index = usb_pipe_ep_index(pipe); virt_dev = ctrl->devs[slot_id]; @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, trb_fields[2] = length_field; trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT); - queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); + last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields); --num_trbs; @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, giveback_first_trb(udev, ep_index, start_cycle, start_trb); +again: event = xhci_wait_for_event(ctrl, TRB_TRANSFER); if (!event) { debug("XHCI bulk transfer timed out, aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe, udev->act_len = 0; return -ETIMEDOUT; } - field = le32_to_cpu(event->trans_event.flags); + if ((void *)event->trans_event.buffer != last_transfer_trb_addr) { + available_length -= + (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)); + xhci_acknowledge_event(ctrl); + goto again; + } + + field = le32_to_cpu(event->trans_event.flags); BUG_ON(TRB_TO_SLOT_ID(field) != slot_id); BUG_ON(TRB_TO_EP_INDEX(field) != ep_index); BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) - buffer > (size_t)length); - record_transfer_result(udev, event, length); + record_transfer_result(udev, event, available_length); xhci_acknowledge_event(ctrl); xhci_inval_cache((uintptr_t)buffer, length);
In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will send request in more than 1 Transfer TRB by chaining them, but then handle only 1 event TRB to mark request completed. However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci controller will generated more than 1 event TRB sometimes, this cause that function mishandle event TRB in next round call, then system hang due to BUG() checking. This patch adds a loop to make sure the event TRB for last Transfer TRB has to be handled in time. Signed-off-by: Ran Wang <ran.wang_1@nxp.com> --- drivers/usb/host/xhci-ring.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)