diff mbox series

usb: xhci: fix event trb handling missed

Message ID 20200922045422.27973-1-ran.wang_1@nxp.com
State Superseded
Delegated to: Marek Vasut
Headers show
Series usb: xhci: fix event trb handling missed | expand

Commit Message

Ran Wang Sept. 22, 2020, 4:54 a.m. UTC
In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
send request in more than 1 Transfer TRB by chaining them, but then handle
only 1 event TRB to mark request completed.

However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
controller will generated more than 1 event TRB sometimes, this cause that
function mishandle event TRB in next round call, then system hang due to
BUG() checking.

This patch adds a loop to make sure the event TRB for last Transfer TRB has
to be handled in time.

Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
---
 drivers/usb/host/xhci-ring.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

Comments

Marek Vasut Sept. 22, 2020, 11:41 a.m. UTC | #1
On 9/22/20 6:54 AM, Ran Wang wrote:
> In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
> send request in more than 1 Transfer TRB by chaining them, but then handle
> only 1 event TRB to mark request completed.
> 
> However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> controller will generated more than 1 event TRB sometimes, this cause that
> function mishandle event TRB in next round call, then system hang due to
> BUG() checking.
> 
> This patch adds a loop to make sure the event TRB for last Transfer TRB has
> to be handled in time.

Bin, can you please take a look at this one ? Thanks
Bin Meng Nov. 6, 2020, 7:54 a.m. UTC | #2
On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut <marex@denx.de> wrote:
>
> On 9/22/20 6:54 AM, Ran Wang wrote:
> > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
> > send request in more than 1 Transfer TRB by chaining them, but then handle
> > only 1 event TRB to mark request completed.
> >
> > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> > controller will generated more than 1 event TRB sometimes, this cause that
> > function mishandle event TRB in next round call, then system hang due to
> > BUG() checking.
> >
> > This patch adds a loop to make sure the event TRB for last Transfer TRB has
> > to be handled in time.
>
> Bin, can you please take a look at this one ? Thanks

Sorry I missed this one. Will take a look at this soon.

Regards,
Bin
Ran Wang Nov. 9, 2020, 12:29 a.m. UTC | #3
Hi Bin,

On Friday, November 6, 2020 3:54 PM, Bin Meng wrote:
> To: Marek Vasut <marex@denx.de>
> Cc: Ran Wang <ran.wang_1@nxp.com>; U-Boot Mailing List
> <u-boot@lists.denx.de>
> Subject: Re: [PATCH] usb: xhci: fix event trb handling missed
> 
> On Tue, Sep 22, 2020 at 7:41 PM Marek Vasut <marex@denx.de> wrote:
> >
> > On 9/22/20 6:54 AM, Ran Wang wrote:
> > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > will send request in more than 1 Transfer TRB by chaining them, but
> > > then handle only 1 event TRB to mark request completed.
> > >
> > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > xhci controller will generated more than 1 event TRB sometimes, this
> > > cause that function mishandle event TRB in next round call, then
> > > system hang due to
> > > BUG() checking.
> > >
> > > This patch adds a loop to make sure the event TRB for last Transfer
> > > TRB has to be handled in time.
> >
> > Bin, can you please take a look at this one ? Thanks
> 
> Sorry I missed this one. Will take a look at this soon.

Kindly remind that this patch is to resolve issue I ever sent to community for help: 
https://lists.denx.de/pipermail/u-boot/2020-September/426981.html

Thanks & Regards,
Ran
Bin Meng Nov. 10, 2020, 5:30 a.m. UTC | #4
Hi Ran,

On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote:
>
> In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will

typo: function

> send request in more than 1 Transfer TRB by chaining them, but then handle
> only 1 event TRB to mark request completed.
>
> However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> controller will generated more than 1 event TRB sometimes, this cause that

I am not sure if it's fair to say this is Layerscape specific
behavior. Based on the xHCI spec, the spec indicates 1 event trb
should be generated when IOC/ISP flag is set to 1 or an error occurs.
I will see if I can reproduce your issue on an x86 board.

> function mishandle event TRB in next round call, then system hang due to
> BUG() checking.
>
> This patch adds a loop to make sure the event TRB for last Transfer TRB has
> to be handled in time.
>
> Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> ---
>  drivers/usb/host/xhci-ring.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 092ed6e..d77e058 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
>         int ret;
>         u32 trb_fields[4];
>         u64 val_64 = virt_to_phys(buffer);
> +       void *last_transfer_trb_addr;
> +       int available_length;
>
>         debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
>                 udev, pipe, buffer, length);
>
> +       available_length = length;
>         ep_index = usb_pipe_ep_index(pipe);
>         virt_dev = ctrl->devs[slot_id];
>
> @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
>                 trb_fields[2] = length_field;
>                 trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT);
>
> -               queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> +               last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
>
>                 --num_trbs;
>
> @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
>
>         giveback_first_trb(udev, ep_index, start_cycle, start_trb);
>
> +again:
>         event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
>         if (!event) {
>                 debug("XHCI bulk transfer timed out, aborting...\n");
> @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
>                 udev->act_len = 0;
>                 return -ETIMEDOUT;
>         }
> -       field = le32_to_cpu(event->trans_event.flags);
>
> +       if ((void *)event->trans_event.buffer != last_transfer_trb_addr) {

This should be:

if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr)

> +               available_length -=
> +                       (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> +               xhci_acknowledge_event(ctrl);
> +               goto again;
> +       }
> +
> +       field = le32_to_cpu(event->trans_event.flags);
>         BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
>         BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
>         BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
>                 buffer > (size_t)length);
>
> -       record_transfer_result(udev, event, length);
> +       record_transfer_result(udev, event, available_length);
>         xhci_acknowledge_event(ctrl);
>         xhci_inval_cache((uintptr_t)buffer, length);
>

Regards,
Bin
Bin Meng Nov. 10, 2020, 5:43 a.m. UTC | #5
Hi Ran,

On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Ran,
>
> On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> >
> > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it will
>
> typo: function
>
> > send request in more than 1 Transfer TRB by chaining them, but then handle
> > only 1 event TRB to mark request completed.
> >
> > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> > controller will generated more than 1 event TRB sometimes, this cause that
>
> I am not sure if it's fair to say this is Layerscape specific
> behavior. Based on the xHCI spec, the spec indicates 1 event trb
> should be generated when IOC/ISP flag is set to 1 or an error occurs.

Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:

        /* Only set interrupt on short packet for IN endpoints */
        if (usb_pipein(pipe))
            field |= TRB_ISP;

Can you verify that if removing the above codes, and without your
changes in this patch, the original issue can be resolved on LS1028?

> I will see if I can reproduce your issue on an x86 board.
>

Note this patch does not apply on top of u-boot/master. Please rebase.

> > function mishandle event TRB in next round call, then system hang due to
> > BUG() checking.
> >
> > This patch adds a loop to make sure the event TRB for last Transfer TRB has
> > to be handled in time.
> >
> > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > ---
> >  drivers/usb/host/xhci-ring.c | 17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> > index 092ed6e..d77e058 100644
> > --- a/drivers/usb/host/xhci-ring.c
> > +++ b/drivers/usb/host/xhci-ring.c
> > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
> >         int ret;
> >         u32 trb_fields[4];
> >         u64 val_64 = virt_to_phys(buffer);
> > +       void *last_transfer_trb_addr;
> > +       int available_length;
> >
> >         debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> >                 udev, pipe, buffer, length);
> >
> > +       available_length = length;
> >         ep_index = usb_pipe_ep_index(pipe);
> >         virt_dev = ctrl->devs[slot_id];
> >
> > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
> >                 trb_fields[2] = length_field;
> >                 trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT);
> >
> > -               queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > +               last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> >
> >                 --num_trbs;
> >
> > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
> >
> >         giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> >
> > +again:
> >         event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> >         if (!event) {
> >                 debug("XHCI bulk transfer timed out, aborting...\n");
> > @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
> >                 udev->act_len = 0;
> >                 return -ETIMEDOUT;
> >         }
> > -       field = le32_to_cpu(event->trans_event.flags);
> >
> > +       if ((void *)event->trans_event.buffer != last_transfer_trb_addr) {
>
> This should be:
>
> if ((void *)le64_to_cpu(event->trans_event.buffer) != last_transfer_trb_addr)
>
> > +               available_length -=
> > +                       (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> > +               xhci_acknowledge_event(ctrl);
> > +               goto again;
> > +       }
> > +
> > +       field = le32_to_cpu(event->trans_event.flags);
> >         BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> >         BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> >         BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> >                 buffer > (size_t)length);
> >
> > -       record_transfer_result(udev, event, length);
> > +       record_transfer_result(udev, event, available_length);
> >         xhci_acknowledge_event(ctrl);
> >         xhci_inval_cache((uintptr_t)buffer, length);
> >

Regards,
Bin
Ran Wang Nov. 10, 2020, 7:36 a.m. UTC | #6
Hi Bin,

On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > Hi Ran,
> >
> > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> > >
> > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > will
> >
> > typo: function

Got it.

> > > send request in more than 1 Transfer TRB by chaining them, but then
> > > handle only 1 event TRB to mark request completed.
> > >
> > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > xhci controller will generated more than 1 event TRB sometimes, this
> > > cause that
> >
> > I am not sure if it's fair to say this is Layerscape specific
> > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> 
> Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> 
>         /* Only set interrupt on short packet for IN endpoints */
>         if (usb_pipein(pipe))
>             field |= TRB_ISP;
> 
> Can you verify that if removing the above codes, and without your changes in
> this patch, the original issue can be resolved on LS1028?

Bingo, removing above codes can resolve my issue, too

> > I will see if I can reproduce your issue on an x86 board.
> >
> 
> Note this patch does not apply on top of u-boot/master. Please rebase.

Sure, I can rebase my patch, but I am nor sure my solution is still worth:
xHCI says "The detection of a USB Short Packet (i.e. the actual number of
bytes received was less than the expected number of bytes defined by the
Transfer TRB) during a transfer does not necessarily generate an Event. "

So does it mean above workaround you suggest would not violate spec, either
(although current Linux kernel driver still set ISP for this case, but may have
a more robust mechanism for event TRB handling)?


> > > function mishandle event TRB in next round call, then system hang
> > > due to
> > > BUG() checking.
> > >
> > > This patch adds a loop to make sure the event TRB for last Transfer
> > > TRB has to be handled in time.
> > >
> > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > > ---
> > >  drivers/usb/host/xhci-ring.c | 17 ++++++++++++++---
> > >  1 file changed, 14 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/usb/host/xhci-ring.c
> > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644
> > > --- a/drivers/usb/host/xhci-ring.c
> > > +++ b/drivers/usb/host/xhci-ring.c
> > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev,
> unsigned long pipe,
> > >         int ret;
> > >         u32 trb_fields[4];
> > >         u64 val_64 = virt_to_phys(buffer);
> > > +       void *last_transfer_trb_addr;
> > > +       int available_length;
> > >
> > >         debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> > >                 udev, pipe, buffer, length);
> > >
> > > +       available_length = length;
> > >         ep_index = usb_pipe_ep_index(pipe);
> > >         virt_dev = ctrl->devs[slot_id];
> > >
> > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned
> long pipe,
> > >                 trb_fields[2] = length_field;
> > >                 trb_fields[3] = field | (TRB_NORMAL <<
> > > TRB_TYPE_SHIFT);
> > >
> > > -               queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > > +               last_transfer_trb_addr = queue_trb(ctrl, ring,
> > > + (num_trbs > 1), trb_fields);
> > >
> > >                 --num_trbs;
> > >
> > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev,
> > > unsigned long pipe,
> > >
> > >         giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> > >
> > > +again:
> > >         event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> > >         if (!event) {
> > >                 debug("XHCI bulk transfer timed out,
> > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct
> usb_device *udev, unsigned long pipe,
> > >                 udev->act_len = 0;
> > >                 return -ETIMEDOUT;
> > >         }
> > > -       field = le32_to_cpu(event->trans_event.flags);
> > >
> > > +       if ((void *)event->trans_event.buffer !=
> > > + last_transfer_trb_addr) {
> >
> > This should be:
> >
> > if ((void *)le64_to_cpu(event->trans_event.buffer) !=
> > last_transfer_trb_addr)

Got it, will update this in next version (if you think my solution is still acceptable for this issue).

Thanks & regards,
Ran

> > > +               available_length -=
> > > +
> (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> > > +               xhci_acknowledge_event(ctrl);
> > > +               goto again;
> > > +       }
> > > +
> > > +       field = le32_to_cpu(event->trans_event.flags);
> > >         BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> > >         BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> > >         BUG_ON(*(void
> **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> > >                 buffer > (size_t)length);
> > >
> > > -       record_transfer_result(udev, event, length);
> > > +       record_transfer_result(udev, event, available_length);
> > >         xhci_acknowledge_event(ctrl);
> > >         xhci_inval_cache((uintptr_t)buffer, length);
> > >
> 
> Regards,
> Bin
Bin Meng Nov. 10, 2020, 7:47 a.m. UTC | #7
Hi Ran,

On Tue, Nov 10, 2020 at 3:36 PM Ran Wang <ran.wang_1@nxp.com> wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 1:43 PM Bin Meng wrote:
> >
> > Hi Ran,
> >
> > On Tue, Nov 10, 2020 at 1:30 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > Hi Ran,
> > >
> > > On Tue, Sep 22, 2020 at 1:02 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> > > >
> > > > In functiion xhci_bulk_tx(), when buffer cross 64KB boundary, it
> > > > will
> > >
> > > typo: function
>
> Got it.
>
> > > > send request in more than 1 Transfer TRB by chaining them, but then
> > > > handle only 1 event TRB to mark request completed.
> > > >
> > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > > > xhci controller will generated more than 1 event TRB sometimes, this
> > > > cause that
> > >
> > > I am not sure if it's fair to say this is Layerscape specific
> > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> >
> > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> >
> >         /* Only set interrupt on short packet for IN endpoints */
> >         if (usb_pipein(pipe))
> >             field |= TRB_ISP;
> >
> > Can you verify that if removing the above codes, and without your changes in
> > this patch, the original issue can be resolved on LS1028?
>
> Bingo, removing above codes can resolve my issue, too

Thank you for your testing.

>
> > > I will see if I can reproduce your issue on an x86 board.
> > >
> >
> > Note this patch does not apply on top of u-boot/master. Please rebase.
>
> Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> xHCI says "The detection of a USB Short Packet (i.e. the actual number of
> bytes received was less than the expected number of bytes defined by the
> Transfer TRB) during a transfer does not necessarily generate an Event. "
>

Yes, that's what the spec says. So in your case, can you add some logs
to verify that there is a transfer event trb generated by the xHC and
the completion code is 13 (short packet)? You can add some debug
output in record_transfer_result().

> So does it mean above workaround you suggest would not violate spec, either
> (although current Linux kernel driver still set ISP for this case, but may have
> a more robust mechanism for event TRB handling)?

I need to do more testing to see if this is a LS1028 behavior or
generic xHCI behavior, ie: on an x86 board. Do you happen to know if
there is any errata document for LS1028 that is related to this?

>
>
> > > > function mishandle event TRB in next round call, then system hang
> > > > due to
> > > > BUG() checking.
> > > >
> > > > This patch adds a loop to make sure the event TRB for last Transfer
> > > > TRB has to be handled in time.
> > > >
> > > > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > > > ---
> > > >  drivers/usb/host/xhci-ring.c | 17 ++++++++++++++---
> > > >  1 file changed, 14 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/usb/host/xhci-ring.c
> > > > b/drivers/usb/host/xhci-ring.c index 092ed6e..d77e058 100644
> > > > --- a/drivers/usb/host/xhci-ring.c
> > > > +++ b/drivers/usb/host/xhci-ring.c
> > > > @@ -578,10 +578,13 @@ int xhci_bulk_tx(struct usb_device *udev,
> > unsigned long pipe,
> > > >         int ret;
> > > >         u32 trb_fields[4];
> > > >         u64 val_64 = virt_to_phys(buffer);
> > > > +       void *last_transfer_trb_addr;
> > > > +       int available_length;
> > > >
> > > >         debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
> > > >                 udev, pipe, buffer, length);
> > > >
> > > > +       available_length = length;
> > > >         ep_index = usb_pipe_ep_index(pipe);
> > > >         virt_dev = ctrl->devs[slot_id];
> > > >
> > > > @@ -701,7 +704,7 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned
> > long pipe,
> > > >                 trb_fields[2] = length_field;
> > > >                 trb_fields[3] = field | (TRB_NORMAL <<
> > > > TRB_TYPE_SHIFT);
> > > >
> > > > -               queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
> > > > +               last_transfer_trb_addr = queue_trb(ctrl, ring,
> > > > + (num_trbs > 1), trb_fields);
> > > >
> > > >                 --num_trbs;
> > > >
> > > > @@ -714,6 +717,7 @@ int xhci_bulk_tx(struct usb_device *udev,
> > > > unsigned long pipe,
> > > >
> > > >         giveback_first_trb(udev, ep_index, start_cycle, start_trb);
> > > >
> > > > +again:
> > > >         event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
> > > >         if (!event) {
> > > >                 debug("XHCI bulk transfer timed out,
> > > > aborting...\n"); @@ -722,14 +726,21 @@ int xhci_bulk_tx(struct
> > usb_device *udev, unsigned long pipe,
> > > >                 udev->act_len = 0;
> > > >                 return -ETIMEDOUT;
> > > >         }
> > > > -       field = le32_to_cpu(event->trans_event.flags);
> > > >
> > > > +       if ((void *)event->trans_event.buffer !=
> > > > + last_transfer_trb_addr) {
> > >
> > > This should be:
> > >
> > > if ((void *)le64_to_cpu(event->trans_event.buffer) !=
> > > last_transfer_trb_addr)
>
> Got it, will update this in next version (if you think my solution is still acceptable for this issue).
>
> Thanks & regards,
> Ran
>
> > > > +               available_length -=
> > > > +
> > (int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
> > > > +               xhci_acknowledge_event(ctrl);
> > > > +               goto again;
> > > > +       }
> > > > +
> > > > +       field = le32_to_cpu(event->trans_event.flags);
> > > >         BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
> > > >         BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
> > > >         BUG_ON(*(void
> > **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
> > > >                 buffer > (size_t)length);
> > > >
> > > > -       record_transfer_result(udev, event, length);
> > > > +       record_transfer_result(udev, event, available_length);
> > > >         xhci_acknowledge_event(ctrl);
> > > >         xhci_inval_cache((uintptr_t)buffer, length);
> > > >

Regards,
Bin
Ran Wang Nov. 10, 2020, 8:05 a.m. UTC | #8
Hi Bin,

On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> 
> Hi Ran,
<snip>
> > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > then handle only 1 event TRB to mark request completed.
> > > > >
> > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > observe xhci controller will generated more than 1 event TRB
> > > > > sometimes, this cause that
> > > >
> > > > I am not sure if it's fair to say this is Layerscape specific
> > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > >
> > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> > >
> > >         /* Only set interrupt on short packet for IN endpoints */
> > >         if (usb_pipein(pipe))
> > >             field |= TRB_ISP;
> > >
> > > Can you verify that if removing the above codes, and without your
> > > changes in this patch, the original issue can be resolved on LS1028?
> >
> > Bingo, removing above codes can resolve my issue, too
> 
> Thank you for your testing.
> 
> >
> > > > I will see if I can reproduce your issue on an x86 board.
> > > >
> > >
> > > Note this patch does not apply on top of u-boot/master. Please rebase.
> >
> > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > xHCI says "The detection of a USB Short Packet (i.e. the actual number
> > of bytes received was less than the expected number of bytes defined
> > by the Transfer TRB) during a transfer does not necessarily generate an Event.
> "
> >
> 
> Yes, that's what the spec says. So in your case, can you add some logs to verify
> that there is a transfer event trb generated by the xHC and the completion
> code is 13 (short packet)? You can add some debug output in
> record_transfer_result().

Sure, let me try this can get back to you later.

> > So does it mean above workaround you suggest would not violate spec,
> > either (although current Linux kernel driver still set ISP for this
> > case, but may have a more robust mechanism for event TRB handling)?
> 
> I need to do more testing to see if this is a LS1028 behavior or generic xHCI
> behavior, ie: on an x86 board. Do you happen to know if there is any errata
> document for LS1028 that is related to this?

I don’t see any errata for LS1028A for this case.

Actually this issue only randomly happens in U-Boot + USB dongle
(I have tried 3 different brands, such as TP-LINK UE300, etc).
And U-Boot + USB mass storage devicecannot reproduce this issue.

Under Linux, no such issue observed neither (with those dongles), all of them works fine.
That's why I decide to improve event TRB handling logic here.

Below is the U-Boot test commands I used (repeatedly do tftp file transfer), issue would happen in less 5 minutes.
for your reference:

setenv tt 'usb reset&&setenv ethact asix_eth&&tftp a0000000 tmp/1MB_test_file'
setenv ttt 'run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt&&run tt'
setenv tttt 'run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt&&run ttt'
setenv u 'run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt&&run tttt'
run u

Regards,
Ran
Ran Wang Nov. 10, 2020, 8:30 a.m. UTC | #9
Hi Bin,

On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote: 
> Hi Bin,
> 
> On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> >
> > Hi Ran,
> <snip>
> > > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > > then handle only 1 event TRB to mark request completed.
> > > > > >
> > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > > observe xhci controller will generated more than 1 event TRB
> > > > > > sometimes, this cause that
> > > > >
> > > > > I am not sure if it's fair to say this is Layerscape specific
> > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > > >
> > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> > > >
> > > >         /* Only set interrupt on short packet for IN endpoints */
> > > >         if (usb_pipein(pipe))
> > > >             field |= TRB_ISP;
> > > >
> > > > Can you verify that if removing the above codes, and without your
> > > > changes in this patch, the original issue can be resolved on LS1028?
> > >
> > > Bingo, removing above codes can resolve my issue, too
> >
> > Thank you for your testing.
> >
> > >
> > > > > I will see if I can reproduce your issue on an x86 board.
> > > > >
> > > >
> > > > Note this patch does not apply on top of u-boot/master. Please rebase.
> > >
> > > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > > xHCI says "The detection of a USB Short Packet (i.e. the actual
> > > number of bytes received was less than the expected number of bytes
> > > defined by the Transfer TRB) during a transfer does not necessarily generate
> an Event.
> > "
> > >
> >
> > Yes, that's what the spec says. So in your case, can you add some logs
> > to verify that there is a transfer event trb generated by the xHC and
> > the completion code is 13 (short packet)? You can add some debug
> > output in record_transfer_result().
> 
> Sure, let me try this can get back to you later.

Confirm completion code is 13 when issue happen (please search ' complete_code:13'):

dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb4fd80
xhci_bulk_tx()#0.trb_buff_len:0x280
xhci_bulk_tx()#0.running_total:0x280
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb50000
xhci_bulk_tx()#0.trb_buff_len:0x580
xhci_bulk_tx()#0.running_total:0x800
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB
xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
xhci_bulk_tx()#0.event->len:384
xhci_bulk_tx()#0.event->complete_code:13
xhci_bulk_tx()#1.field:0x1058001
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return

dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05)
xhci_bulk_tx()#0.1.running_total:0x640
xhci_bulk_tx()#0.2.trb_buff_len:0x640
xhci_bulk_tx()#0.3.running_total:0x640
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#0.5.running_total:0x640
xhci_bulk_tx()#0.start_trb:0x00000000fbb47610
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB
xhci_bulk_tx()#0.addr:0xfbb2f9c0
xhci_bulk_tx()#0.trb_buff_len:0x6e
xhci_bulk_tx()#0.running_total:0x6e
xhci_bulk_tx()#0.length:0x6e
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB')
xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000
xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
xhci_bulk_tx()#0.event->len:0
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#1.field:0x1058000
xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG()
xhci_bulk_tx()#3.ep_index:0x5-------------------------------------
ep_index mis-match wait for event again!!!!!!!!!----------------------------------
BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()!
BUG!
### ERROR ### Please RESET the board ###

Regards,
Ran
Bin Meng Nov. 10, 2020, 9:02 a.m. UTC | #10
Hi Ran,

On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 3:47 PM Bin Meng wrote:
> > >
> > > Hi Ran,
> > <snip>
> > > > > > > send request in more than 1 Transfer TRB by chaining them, but
> > > > > > > then handle only 1 event TRB to mark request completed.
> > > > > > >
> > > > > > > However, on Layerscape platforms (LS1028A, LS1088A, etc), we
> > > > > > > observe xhci controller will generated more than 1 event TRB
> > > > > > > sometimes, this cause that
> > > > > >
> > > > > > I am not sure if it's fair to say this is Layerscape specific
> > > > > > behavior. Based on the xHCI spec, the spec indicates 1 event trb
> > > > > > should be generated when IOC/ISP flag is set to 1 or an error occurs.
> > > > >
> > > > > Ah, ISP flag is set if the pipe is from an IN endpoint. Currently we have:
> > > > >
> > > > >         /* Only set interrupt on short packet for IN endpoints */
> > > > >         if (usb_pipein(pipe))
> > > > >             field |= TRB_ISP;
> > > > >
> > > > > Can you verify that if removing the above codes, and without your
> > > > > changes in this patch, the original issue can be resolved on LS1028?
> > > >
> > > > Bingo, removing above codes can resolve my issue, too
> > >
> > > Thank you for your testing.
> > >
> > > >
> > > > > > I will see if I can reproduce your issue on an x86 board.
> > > > > >
> > > > >
> > > > > Note this patch does not apply on top of u-boot/master. Please rebase.
> > > >
> > > > Sure, I can rebase my patch, but I am nor sure my solution is still worth:
> > > > xHCI says "The detection of a USB Short Packet (i.e. the actual
> > > > number of bytes received was less than the expected number of bytes
> > > > defined by the Transfer TRB) during a transfer does not necessarily generate
> > an Event.
> > > "
> > > >
> > >
> > > Yes, that's what the spec says. So in your case, can you add some logs
> > > to verify that there is a transfer event trb generated by the xHC and
> > > the completion code is 13 (short packet)? You can add some debug
> > > output in record_transfer_result().
> >
> > Sure, let me try this can get back to you later.
>
> Confirm completion code is 13 when issue happen (please search ' complete_code:13'):

Thank you very much for your testing.

>
> dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80, length=2048----------------------------buffer would cross 64KB boundary, so we will send request in more than 1 TRB, this is the key issue trigger condition
> xhci_bulk_tx()#0.1.running_total:0x280
> xhci_bulk_tx()#0.2.trb_buff_len:0x280
> xhci_bulk_tx()#0.3.running_total:0x280
> xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
> xhci_bulk_tx()#0.5.running_total:0x10280
> xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble 1st Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb4fd80
> xhci_bulk_tx()#0.trb_buff_len:0x280
> xhci_bulk_tx()#0.running_total:0x280
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble 2nd Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb50000
> xhci_bulk_tx()#0.trb_buff_len:0x580
> xhci_bulk_tx()#0.running_total:0x800
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47140----------------handle event TRB for 1st Transfer TRB
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0xd000180
> xhci_bulk_tx()#0.event->trans_event.flags:0x1058001
> xhci_bulk_tx()#0.event->len:384
> xhci_bulk_tx()#0.event->complete_code:13
> xhci_bulk_tx()#1.field:0x1058001
> xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4
> xhci_bulk_tx()#3.ep_index:0x4----------------------------------------------------------xhci_bulk_tx() return
>
> dev=00000000fbb4f3c0, pipe=c0018203, buffer=00000000fbb2f9c0, length=110----------an other call comming, and for different EP (ep_index = 05)
> xhci_bulk_tx()#0.1.running_total:0x640
> xhci_bulk_tx()#0.2.trb_buff_len:0x640
> xhci_bulk_tx()#0.3.running_total:0x640
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#0.5.running_total:0x640
> xhci_bulk_tx()#0.start_trb:0x00000000fbb47610
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47610---------queue 1st transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb2f9c0
> xhci_bulk_tx()#0.trb_buff_len:0x6e
> xhci_bulk_tx()#0.running_total:0x6e
> xhci_bulk_tx()#0.length:0x6e
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb47150------------hand event TRB, note that this buffer is for last Transfer TRB in last call (see above '2nd Transfer TRB')
> xhci_bulk_tx()#0.event->trans_event.transfer_len:0x1000000
> xhci_bulk_tx()#0.event->trans_event.flags:0x1058000
> xhci_bulk_tx()#0.event->len:0
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#1.field:0x1058000
> xhci_bulk_tx()#2.TRB_TO_EP_ENDEX(field):0x4----------------ep_index is 4 rather than 5, then cause BUG()
> xhci_bulk_tx()#3.ep_index:0x5-------------------------------------
> ep_index mis-match wait for event again!!!!!!!!!----------------------------------
> BUG at drivers/usb/host/xhci-ring.c:759/xhci_bulk_tx()!
> BUG!
> ### ERROR ### Please RESET the board ###
>

Based on your log, it matches what I suspected and we are getting
close to the root cause.

Could you please try the following simple test case to see if it can
trigger the issue with a USB disk?
After U-Boot boot on LS1028, with a USB flash disk attached, then type
"usb start" to see if the USB disk can be recognized by U-Boot?

commit 60e68ed667362870c20b36ae26dacc1af903e81e
Author: Bin Meng <bmeng.cn@gmail.com>
Date:   Tue Nov 10 16:19:06 2020 +0800

    WIP: usb: A simple test case to trigger the TRB 64K boundary issue
with mass storage device

    This should not be applied as it aims to provide a simple test case to
    trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.

    Signed-off-by: Bin Meng <bmeng.cn@gmail.com>

diff --git a/common/usb_storage.c b/common/usb_storage.c
index ff25441..c8aec2e 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
*srb, struct us_data *us)
        int dir_in;
        int actlen, data_actlen;
        unsigned int pipe, pipein, pipeout;
+#if 0
        ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
+#else
+       struct umass_bbb_csw *csw_org, *csw;
+       csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE);
+       csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000);
+       csw = (struct umass_bbb_csw *)((ulong)csw - 2);
+       printf("csw org %p, adjusted to %p\n", csw_org, csw);
+#endif
 #ifdef BBB_XPORT_TRACE
        unsigned char *ptr;
        int index;
@@ -824,6 +832,7 @@ again:
                return USB_STOR_TRANSPORT_FAILED;
        }

+       free(csw_org);
        return result;
 }


Regards,
Bin
Ran Wang Nov. 10, 2020, 9:40 a.m. UTC | #11
Hi Bin,

On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> >
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > Hi Bin,

<snip>

> >
> 
> Based on your log, it matches what I suspected and we are getting close to the
> root cause.
> 
> Could you please try the following simple test case to see if it can trigger the
> issue with a USB disk?
> After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb
> start" to see if the USB disk can be recognized by U-Boot?
> 
> commit 60e68ed667362870c20b36ae26dacc1af903e81e
> Author: Bin Meng <bmeng.cn@gmail.com>
> Date:   Tue Nov 10 16:19:06 2020 +0800
> 
>     WIP: usb: A simple test case to trigger the TRB 64K boundary issue with
> mass storage device
> 
>     This should not be applied as it aims to provide a simple test case to
>     trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> 
>     Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> 
> diff --git a/common/usb_storage.c b/common/usb_storage.c index
> ff25441..c8aec2e 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
> *srb, struct us_data *us)
>         int dir_in;
>         int actlen, data_actlen;
>         unsigned int pipe, pipein, pipeout;
> +#if 0
>         ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> +#else
> +       struct umass_bbb_csw *csw_org, *csw;
> +       csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE);
> +       csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000);
> +       csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> +       printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
>  #ifdef BBB_XPORT_TRACE
>         unsigned char *ptr;
>         int index;
> @@ -824,6 +832,7 @@ again:
>                 return USB_STOR_TRANSPORT_FAILED;
>         }
> 
> +       free(csw_org);
>         return result;
>  }
> 

Issue cannot be reproduced on mass storage device with above hacking:
=>
resetting USB...
Bus usb3@3100000: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
Bus usb3@3110000: Register 200017f NbrPorts 2
Starting the controller
USB XHCI 1.00
scanning bus usb3@3100000 for devices... record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 13
record_transfer_result():comp_code: 1
csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
2 USB Device(s) found
scanning bus usb3@3110000 for devices... 1 USB Device(s) found
       scanning usb for storage devices... 1 Storage Device(s) found
=>

One more thing I'd like to tell is that I found this issue would not happened with 'usb start' (even with USB dongle).

Below is the log I tested with ' usb read 0x82000000 0 300000', issue could not be reproduced for at least 10 minutes
...
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
record_transfer_result():comp_code: 1
...

Regards,
Ran
Bin Meng Nov. 10, 2020, 9:47 a.m. UTC | #12
Hi Ran,

On Tue, Nov 10, 2020 at 5:41 PM Ran Wang <ran.wang_1@nxp.com> wrote:
>
> Hi Bin,
>
> On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> >
> > Hi Ran,
> >
> > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> > >
> > > Hi Bin,
> > >
> > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > > Hi Bin,
>
> <snip>
>
> > >
> >
> > Based on your log, it matches what I suspected and we are getting close to the
> > root cause.
> >
> > Could you please try the following simple test case to see if it can trigger the
> > issue with a USB disk?
> > After U-Boot boot on LS1028, with a USB flash disk attached, then type "usb
> > start" to see if the USB disk can be recognized by U-Boot?
> >
> > commit 60e68ed667362870c20b36ae26dacc1af903e81e
> > Author: Bin Meng <bmeng.cn@gmail.com>
> > Date:   Tue Nov 10 16:19:06 2020 +0800
> >
> >     WIP: usb: A simple test case to trigger the TRB 64K boundary issue with
> > mass storage device
> >
> >     This should not be applied as it aims to provide a simple test case to
> >     trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> >
> >     Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> >
> > diff --git a/common/usb_storage.c b/common/usb_storage.c index
> > ff25441..c8aec2e 100644
> > --- a/common/usb_storage.c
> > +++ b/common/usb_storage.c
> > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct scsi_cmd
> > *srb, struct us_data *us)
> >         int dir_in;
> >         int actlen, data_actlen;
> >         unsigned int pipe, pipein, pipeout;
> > +#if 0
> >         ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> > +#else
> > +       struct umass_bbb_csw *csw_org, *csw;
> > +       csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE);
> > +       csw = (struct umass_bbb_csw *)roundup((ulong)csw_org, 0x10000);
> > +       csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> > +       printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
> >  #ifdef BBB_XPORT_TRACE
> >         unsigned char *ptr;
> >         int index;
> > @@ -824,6 +832,7 @@ again:
> >                 return USB_STOR_TRANSPORT_FAILED;
> >         }
> >
> > +       free(csw_org);
> >         return result;
> >  }
> >
>
> Issue cannot be reproduced on mass storage device with above hacking:

Thank you for the testing.

> =>
> resetting USB...
> Bus usb3@3100000: Register 200017f NbrPorts 2
> Starting the controller
> USB XHCI 1.00
> Bus usb3@3110000: Register 200017f NbrPorts 2
> Starting the controller
> USB XHCI 1.00
> scanning bus usb3@3100000 for devices... record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 13
> record_transfer_result():comp_code: 1
> csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> 2 USB Device(s) found
> scanning bus usb3@3110000 for devices... 1 USB Device(s) found
>        scanning usb for storage devices... 1 Storage Device(s) found
> =>
>
> One more thing I'd like to tell is that I found this issue would not happened with 'usb start' (even with USB dongle).
>
> Below is the log I tested with ' usb read 0x82000000 0 300000', issue could not be reproduced for at least 10 minutes
> ...
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1
> record_transfer_result():comp_code: 1

In the earlier log you sent, you have the following:

dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80,
length=2048----------------------------buffer would cross 64KB
boundary, so we will send request in more than 1 TRB, this is the key
issue trigger condition
xhci_bulk_tx()#0.1.running_total:0x280
xhci_bulk_tx()#0.2.trb_buff_len:0x280
xhci_bulk_tx()#0.3.running_total:0x280
xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
xhci_bulk_tx()#0.5.running_total:0x10280
xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140----------Assemble
1st Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb4fd80
xhci_bulk_tx()#0.trb_buff_len:0x280
xhci_bulk_tx()#0.running_total:0x280
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150----------Assemble
2nd Transfer TRB
xhci_bulk_tx()#0.addr:0xfbb50000
xhci_bulk_tx()#0.trb_buff_len:0x580
xhci_bulk_tx()#0.running_total:0x800
xhci_bulk_tx()#0.length:0x800
xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000

Could you please specify where this output is for?

"xhci_bulk_tx()#0.length:0x800"

Regards,
Bin
Ran Wang Nov. 10, 2020, 9:54 a.m. UTC | #13
Hi Bin,

On Tuesday, November 10, 2020 5:47 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Tue, Nov 10, 2020 at 5:41 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> >
> > Hi Bin,
> >
> > On Tuesday, November 10, 2020 5:03 PM Bin Meng wrote:
> > >
> > > Hi Ran,
> > >
> > > On Tue, Nov 10, 2020 at 4:31 PM Ran Wang <ran.wang_1@nxp.com>
> wrote:
> > > >
> > > > Hi Bin,
> > > >
> > > > On Tuesday, November 10, 2020 4:06 PM Ran Wang wrote:
> > > > > Hi Bin,
> >
> > <snip>
> >
> > > >
> > >
> > > Based on your log, it matches what I suspected and we are getting
> > > close to the root cause.
> > >
> > > Could you please try the following simple test case to see if it can
> > > trigger the issue with a USB disk?
> > > After U-Boot boot on LS1028, with a USB flash disk attached, then
> > > type "usb start" to see if the USB disk can be recognized by U-Boot?
> > >
> > > commit 60e68ed667362870c20b36ae26dacc1af903e81e
> > > Author: Bin Meng <bmeng.cn@gmail.com>
> > > Date:   Tue Nov 10 16:19:06 2020 +0800
> > >
> > >     WIP: usb: A simple test case to trigger the TRB 64K boundary
> > > issue with mass storage device
> > >
> > >     This should not be applied as it aims to provide a simple test case to
> > >     trigger the TRB 64K boundary issue as reported by Ran Wang @ NXP.
> > >
> > >     Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
> > >
> > > diff --git a/common/usb_storage.c b/common/usb_storage.c index
> > > ff25441..c8aec2e 100644
> > > --- a/common/usb_storage.c
> > > +++ b/common/usb_storage.c
> > > @@ -710,7 +710,15 @@ static int usb_stor_BBB_transport(struct
> > > scsi_cmd *srb, struct us_data *us)
> > >         int dir_in;
> > >         int actlen, data_actlen;
> > >         unsigned int pipe, pipein, pipeout;
> > > +#if 0
> > >         ALLOC_CACHE_ALIGN_BUFFER(struct umass_bbb_csw, csw, 1);
> > > +#else
> > > +       struct umass_bbb_csw *csw_org, *csw;
> > > +       csw_org = malloc(0x10000 + UMASS_BBB_CSW_SIZE);
> > > +       csw = (struct umass_bbb_csw *)roundup((ulong)csw_org,
> 0x10000);
> > > +       csw = (struct umass_bbb_csw *)((ulong)csw - 2);
> > > +       printf("csw org %p, adjusted to %p\n", csw_org, csw); #endif
> > >  #ifdef BBB_XPORT_TRACE
> > >         unsigned char *ptr;
> > >         int index;
> > > @@ -824,6 +832,7 @@ again:
> > >                 return USB_STOR_TRANSPORT_FAILED;
> > >         }
> > >
> > > +       free(csw_org);
> > >         return result;
> > >  }
> > >
> >
> > Issue cannot be reproduced on mass storage device with above hacking:
> 
> Thank you for the testing.
> 
> > =>
> > resetting USB...
> > Bus usb3@3100000: Register 200017f NbrPorts 2 Starting the controller
> > USB XHCI 1.00 Bus usb3@3110000: Register 200017f NbrPorts 2 Starting
> > the controller USB XHCI 1.00 scanning bus usb3@3100000 for devices...
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 13
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb53010, adjusted to 00000000fbb5fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > 2 USB Device(s) found
> > scanning bus usb3@3110000 for devices... 1 USB Device(s) found
> >        scanning usb for storage devices... 1 Storage Device(s) found
> > =>
> >
> > One more thing I'd like to tell is that I found this issue would not happened
> with 'usb start' (even with USB dongle).
> >
> > Below is the log I tested with ' usb read 0x82000000 0 300000', issue
> > could not be reproduced for at least 10 minutes ...
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > csw org 00000000fbb4ec90, adjusted to 00000000fbb4fffe
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> > record_transfer_result():comp_code: 1
> 
> In the earlier log you sent, you have the following:
> 
> dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80,
> length=2048----------------------------buffer would cross 64KB boundary, so we will
> send request in more than 1 TRB, this is the key issue trigger condition
> xhci_bulk_tx()#0.1.running_total:0x280
> xhci_bulk_tx()#0.2.trb_buff_len:0x280
> xhci_bulk_tx()#0.3.running_total:0x280
> xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer TRB
> xhci_bulk_tx()#0.5.running_total:0x10280
> xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140---------
> -Assemble
> 1st Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb4fd80
> xhci_bulk_tx()#0.trb_buff_len:0x280
> xhci_bulk_tx()#0.running_total:0x280
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150---------
> -Assemble
> 2nd Transfer TRB
> xhci_bulk_tx()#0.addr:0xfbb50000
> xhci_bulk_tx()#0.trb_buff_len:0x580
> xhci_bulk_tx()#0.running_total:0x800
> xhci_bulk_tx()#0.length:0x800
> xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> 
> Could you please specify where this output is for?

All this prints is within function xhci_bulk_tx(),

Let me open those verbose output and check if that crossing 64KB case has been triggered or not.
But I have to get back to you later (tomorrow).

Regards,
Ran

> "xhci_bulk_tx()#0.length:0x800"
> 
> Regards,
> Bin
Ran Wang Nov. 11, 2020, 8:07 a.m. UTC | #14
Hi Bin

On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> 
> Hi Bin,
> 

<snip>

> > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > adjusted to 00000000fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > adjusted to 00000000fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > adjusted to 00000000fbb4fffe
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> > > record_transfer_result():comp_code: 1
> >
> > In the earlier log you sent, you have the following:
> >
> > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80,
> > length=2048----------------------------buffer would cross 64KB
> > boundary, so we will send request in more than 1 TRB, this is the key
> > issue trigger condition
> > xhci_bulk_tx()#0.1.running_total:0x280
> > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > xhci_bulk_tx()#0.3.running_total:0x280
> > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer
> > TRB
> > xhci_bulk_tx()#0.5.running_total:0x10280
> > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140-
> > --------
> > -Assemble
> > 1st Transfer TRB
> > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > xhci_bulk_tx()#0.trb_buff_len:0x280
> > xhci_bulk_tx()#0.running_total:0x280
> > xhci_bulk_tx()#0.length:0x800
> > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150-
> > --------
> > -Assemble
> > 2nd Transfer TRB
> > xhci_bulk_tx()#0.addr:0xfbb50000
> > xhci_bulk_tx()#0.trb_buff_len:0x580
> > xhci_bulk_tx()#0.running_total:0x800
> > xhci_bulk_tx()#0.length:0x800
> > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> >
> > Could you please specify where this output is for?
> 
> All this prints is within function xhci_bulk_tx(),
> 
> Let me open those verbose output and check if that crossing 64KB case has been
> triggered or not.
> But I have to get back to you later (tomorrow).

With your test commit and my verbose print code, issue cannot be reproduced in mass storage usage
(although crossing 64KB boundary condition has met), so looks like the only different to ethernet device
is that if xHC would receive short packet:

=> pri t
t=usb read 0x82000000 0 300000
=> run t

usb read: device 0 block # 0, count 3145728 ... 
usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to 00000000fbb4fffe
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1a9c0, length=0x1f
xhci_bulk_tx()#0.4.num_trbs:0x1
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49570
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49570
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x3
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb1ab00, length=0x200
xhci_bulk_tx()#0.4.num_trbs:0x1
xhci_bulk_tx()#set TRB_ISP------------------------------------ 
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49180
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49180
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x2
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB boundary
xhci_bulk_tx()#0.4.num_trbs:0x2
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49190----------------1st transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a0----------------2nd transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb491a0---------------------event TRB that pointing to 2nd transfer TRB
xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet
xhci_bulk_tx()#3.ep_index:0x2
num_trbs_count > 1!!!!!!!!!
...
usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to 00000000fbb4fffe
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1ad00, length=0x1f
xhci_bulk_tx()#0.4.num_trbs:0x1
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49790
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49790
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x3
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000883d8000, length=0x1e000-----------------buffer cross 64KB boundary
xhci_bulk_tx()#0.4.num_trbs:0x3
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a0----------------1st transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b0----------------2nd transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c0----------------3rd transfer TRB queued
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492c0---------------------event TRB that pointing to 3rd transfer TRB
xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet
xhci_bulk_tx()#3.ep_index:0x2
num_trbs_count > 1!!!!!!!!!
xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd
xhci_bulk_tx()#0.4.num_trbs:0x2
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d0
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0
xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e0
xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0
--------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492e0
xhci_bulk_tx()#0.event->complete_code:1
xhci_bulk_tx()#3.ep_index:0x2
num_trbs_count > 1!!!!!!!!!

Regards
Ran
Bin Meng Nov. 13, 2020, 6:25 a.m. UTC | #15
Hi Ran,

On Wed, Nov 11, 2020 at 4:07 PM Ran Wang <ran.wang_1@nxp.com> wrote:
>
> Hi Bin
>
> On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> >
> > Hi Bin,
> >
>
> <snip>
>
> > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > adjusted to 00000000fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > adjusted to 00000000fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > adjusted to 00000000fbb4fffe
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > > > record_transfer_result():comp_code: 1
> > >
> > > In the earlier log you sent, you have the following:
> > >
> > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80,
> > > length=2048----------------------------buffer would cross 64KB
> > > boundary, so we will send request in more than 1 TRB, this is the key
> > > issue trigger condition
> > > xhci_bulk_tx()#0.1.running_total:0x280
> > > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > > xhci_bulk_tx()#0.3.running_total:0x280
> > > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2 Transfer
> > > TRB
> > > xhci_bulk_tx()#0.5.running_total:0x10280
> > > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47140-
> > > --------
> > > -Assemble
> > > 1st Transfer TRB
> > > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > > xhci_bulk_tx()#0.trb_buff_len:0x280
> > > xhci_bulk_tx()#0.running_total:0x280
> > > xhci_bulk_tx()#0.length:0x800
> > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47150-
> > > --------
> > > -Assemble
> > > 2nd Transfer TRB
> > > xhci_bulk_tx()#0.addr:0xfbb50000
> > > xhci_bulk_tx()#0.trb_buff_len:0x580
> > > xhci_bulk_tx()#0.running_total:0x800
> > > xhci_bulk_tx()#0.length:0x800
> > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> > >
> > > Could you please specify where this output is for?
> >
> > All this prints is within function xhci_bulk_tx(),
> >
> > Let me open those verbose output and check if that crossing 64KB case has been
> > triggered or not.
> > But I have to get back to you later (tomorrow).
>
> With your test commit and my verbose print code, issue cannot be reproduced in mass storage usage
> (although crossing 64KB boundary condition has met), so looks like the only different to ethernet device
> is that if xHC would receive short packet:
>

Thank you.

> => pri t
> t=usb read 0x82000000 0 300000
> => run t
>
> usb read: device 0 block # 0, count 3145728 ...
> usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to 00000000fbb4fffe
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1a9c0, length=0x1f
> xhci_bulk_tx()#0.4.num_trbs:0x1
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49570
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49570
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x3
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb1ab00, length=0x200
> xhci_bulk_tx()#0.4.num_trbs:0x1
> xhci_bulk_tx()#set TRB_ISP------------------------------------
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49180
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49180
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x2
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB boundary
> xhci_bulk_tx()#0.4.num_trbs:0x2
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49190----------------1st transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a0----------------2nd transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb491a0---------------------event TRB that pointing to 2nd transfer TRB
> xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet
> xhci_bulk_tx()#3.ep_index:0x2
> num_trbs_count > 1!!!!!!!!!
> ...
> usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to 00000000fbb4fffe
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203, buffer=00000000fbb1ad00, length=0x1f
> xhci_bulk_tx()#0.4.num_trbs:0x1
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb49790
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb49790
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x3
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000883d8000, length=0x1e000-----------------buffer cross 64KB boundary
> xhci_bulk_tx()#0.4.num_trbs:0x3
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a0----------------1st transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b0----------------2nd transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c0----------------3rd transfer TRB queued
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492c0---------------------event TRB that pointing to 3rd transfer TRB
> xhci_bulk_tx()#0.event->complete_code:1-----------------------------------------------------------------not short packet
> xhci_bulk_tx()#3.ep_index:0x2
> num_trbs_count > 1!!!!!!!!!
> xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283, buffer=00000000fbb4fffe, length=0xd
> xhci_bulk_tx()#0.4.num_trbs:0x2
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d0
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0
> xhci_bulk_tx()#set TRB_ISP---------------------------------------------------------------------------------------Set ISP
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e0
> xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0
> --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb492e0
> xhci_bulk_tx()#0.event->complete_code:1
> xhci_bulk_tx()#3.ep_index:0x2
> num_trbs_count > 1!!!!!!!!!

Please rebase this patch against u-boot/master, and reword the
following commit message to indicate that it was because of the ISP
flag set on an input pipe. In the 64K boundary case, when the first
TRB ends up with a short packet condition it will trigger an short
packet code transfer event per ISP flag, and current codes will only
handle the first transfer event.

> However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe xhci
> controller will generated more than 1 event TRB sometimes, this cause that
> function mishandle event TRB in next round call, then system hang due to
> BUG() checking.

Regards,
Bin
Ran Wang Nov. 13, 2020, 6:48 a.m. UTC | #16
Hi Bin,

On Friday, November 13, 2020 2:25 PM Bin Meng wrote:
> 
> Hi Ran,
> 
> On Wed, Nov 11, 2020 at 4:07 PM Ran Wang <ran.wang_1@nxp.com> wrote:
> >
> > Hi Bin
> >
> > On Tuesday, November 10, 2020 5:55 PM, Ran Wang wrote:
> > >
> > > Hi Bin,
> > >
> >
> > <snip>
> >
> > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > > adjusted to 00000000fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > > adjusted to 00000000fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1 csw org 00000000fbb4ec90,
> > > > > adjusted to 00000000fbb4fffe
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > > > record_transfer_result():comp_code: 1
> > > >
> > > > In the earlier log you sent, you have the following:
> > > >
> > > > dev=00000000fbb4f3c0, pipe=c0010283, buffer=00000000fbb4fd80,
> > > > length=2048----------------------------buffer would cross 64KB
> > > > boundary, so we will send request in more than 1 TRB, this is the
> > > > key issue trigger condition
> > > > xhci_bulk_tx()#0.1.running_total:0x280
> > > > xhci_bulk_tx()#0.2.trb_buff_len:0x280
> > > > xhci_bulk_tx()#0.3.running_total:0x280
> > > > xhci_bulk_tx()#0.4.num_trbs:0x2--------------------------2
> > > > Transfer TRB
> > > > xhci_bulk_tx()#0.5.running_total:0x10280
> > > > xhci_bulk_tx()#0.start_trb:0x00000000fbb47140
> > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47
> > > > 140-
> > > > --------
> > > > -Assemble
> > > > 1st Transfer TRB
> > > > xhci_bulk_tx()#0.addr:0xfbb4fd80
> > > > xhci_bulk_tx()#0.trb_buff_len:0x280
> > > > xhci_bulk_tx()#0.running_total:0x280
> > > > xhci_bulk_tx()#0.length:0x800
> > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> > > > ----------xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb47
> > > > 150-
> > > > --------
> > > > -Assemble
> > > > 2nd Transfer TRB
> > > > xhci_bulk_tx()#0.addr:0xfbb50000
> > > > xhci_bulk_tx()#0.trb_buff_len:0x580
> > > > xhci_bulk_tx()#0.running_total:0x800
> > > > xhci_bulk_tx()#0.length:0x800
> > > > xhci_bulk_tx()#0.TRB_MAX_BUFF_SIZE:0x10000
> > > >
> > > > Could you please specify where this output is for?
> > >
> > > All this prints is within function xhci_bulk_tx(),
> > >
> > > Let me open those verbose output and check if that crossing 64KB
> > > case has been triggered or not.
> > > But I have to get back to you later (tomorrow).
> >
> > With your test commit and my verbose print code, issue cannot be
> > reproduced in mass storage usage (although crossing 64KB boundary
> > condition has met), so looks like the only different to ethernet device is that if
> xHC would receive short packet:
> >
> 
> Thank you.
> 
> > => pri t
> > t=usb read 0x82000000 0 300000
> > => run t
> >
> > usb read: device 0 block # 0, count 3145728 ...
> > usb_stor_BBB_transport()csw org 00000000fbb4ac90, adjusted to
> > 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203,
> > buffer=00000000fbb1a9c0, length=0x1f
> > xhci_bulk_tx()#0.4.num_trbs:0x1
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4957
> > ++++++++++++0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49570
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 49570
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x3
> > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283,
> > buffer=00000000fbb1ab00, length=0x200
> > xhci_bulk_tx()#0.4.num_trbs:0x1
> > xhci_bulk_tx()#set TRB_ISP------------------------------------
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4918
> > ++++++++++++0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49180
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 49180
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x2
> > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283,
> > buffer=00000000fbb4fffe, length=0xd-----------------buffer cross 64KB
> > boundary
> > xhci_bulk_tx()#0.4.num_trbs:0x2
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4919
> > ++++++++++++0----------------1st transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49190
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb491a
> > ++++++++++++0----------------2nd transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb491a0
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 491a0---------------------event TRB that pointing to 2nd transfer TRB
> > xhci_bulk_tx()#0.event->complete_code:1-------------------------------
> > ----------------------------------not short packet
> > xhci_bulk_tx()#3.ep_index:0x2
> > num_trbs_count > 1!!!!!!!!!
> > ...
> > usb_stor_BBB_transport()csw org 00000000fbb4aea0, adjusted to
> > 00000000fbb4fffe xhci_bulk_tx()dev=00000000fbb49950, pipe=c0010203,
> > buffer=00000000fbb1ad00, length=0x1f
> > xhci_bulk_tx()#0.4.num_trbs:0x1
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb4979
> > ++++++++++++0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb49790
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 49790
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x3
> > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283,
> > buffer=00000000883d8000, length=0x1e000-----------------buffer cross
> > 64KB boundary
> > xhci_bulk_tx()#0.4.num_trbs:0x3
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492a
> > ++++++++++++0----------------1st transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492a0
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492b
> > ++++++++++++0----------------2nd transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492b0
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492c
> > ++++++++++++0----------------3rd transfer TRB queued
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492c0
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 492c0---------------------event TRB that pointing to 3rd transfer TRB
> > xhci_bulk_tx()#0.event->complete_code:1-------------------------------
> > ----------------------------------not short packet
> > xhci_bulk_tx()#3.ep_index:0x2
> > num_trbs_count > 1!!!!!!!!!
> > xhci_bulk_tx()dev=00000000fbb49950, pipe=c0008283,
> > buffer=00000000fbb4fffe, length=0xd
> > xhci_bulk_tx()#0.4.num_trbs:0x2
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492d
> > ++++++++++++0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492d0
> > xhci_bulk_tx()#set
> > TRB_ISP---------------------------------------------------------------
> > ------------------------Set ISP
> >
> ++++++++++++xhci_bulk_tx()#0.&ring->enqueue->generic:0x00000000fbb492e
> > ++++++++++++0
> > xhci_bulk_tx()#0.last_transfer_trb_addr:0x00000000fbb492e0
> > --------------xhci_bulk_tx()#0.event->trans_event.buffer:0x00000000fbb
> > 492e0
> > xhci_bulk_tx()#0.event->complete_code:1
> > xhci_bulk_tx()#3.ep_index:0x2
> > num_trbs_count > 1!!!!!!!!!
> 
> Please rebase this patch against u-boot/master, and reword the following
> commit message to indicate that it was because of the ISP flag set on an input
> pipe. In the 64K boundary case, when the first TRB ends up with a short packet
> condition it will trigger an short packet code transfer event per ISP flag, and
> current codes will only handle the first transfer event.

Got it, will work out v2, thanks for the review.

Regards,
Ran

> > However, on Layerscape platforms (LS1028A, LS1088A, etc), we observe
> > xhci controller will generated more than 1 event TRB sometimes, this
> > cause that function mishandle event TRB in next round call, then
> > system hang due to
> > BUG() checking.
> 
> Regards,
> Bin
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 092ed6e..d77e058 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -578,10 +578,13 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 	int ret;
 	u32 trb_fields[4];
 	u64 val_64 = virt_to_phys(buffer);
+	void *last_transfer_trb_addr;
+	int available_length;
 
 	debug("dev=%p, pipe=%lx, buffer=%p, length=%d\n",
 		udev, pipe, buffer, length);
 
+	available_length = length;
 	ep_index = usb_pipe_ep_index(pipe);
 	virt_dev = ctrl->devs[slot_id];
 
@@ -701,7 +704,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 		trb_fields[2] = length_field;
 		trb_fields[3] = field | (TRB_NORMAL << TRB_TYPE_SHIFT);
 
-		queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
+		last_transfer_trb_addr = queue_trb(ctrl, ring, (num_trbs > 1), trb_fields);
 
 		--num_trbs;
 
@@ -714,6 +717,7 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 
 	giveback_first_trb(udev, ep_index, start_cycle, start_trb);
 
+again:
 	event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
 	if (!event) {
 		debug("XHCI bulk transfer timed out, aborting...\n");
@@ -722,14 +726,21 @@  int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
 		udev->act_len = 0;
 		return -ETIMEDOUT;
 	}
-	field = le32_to_cpu(event->trans_event.flags);
 
+	if ((void *)event->trans_event.buffer != last_transfer_trb_addr) {
+		available_length -=
+			(int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len));
+		xhci_acknowledge_event(ctrl);
+		goto again;
+	}
+
+	field = le32_to_cpu(event->trans_event.flags);
 	BUG_ON(TRB_TO_SLOT_ID(field) != slot_id);
 	BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
 	BUG_ON(*(void **)(uintptr_t)le64_to_cpu(event->trans_event.buffer) -
 		buffer > (size_t)length);
 
-	record_transfer_result(udev, event, length);
+	record_transfer_result(udev, event, available_length);
 	xhci_acknowledge_event(ctrl);
 	xhci_inval_cache((uintptr_t)buffer, length);